NUCLEIC ACID DETECTION METHODS USING UNIVERSAL PRIMING

The present application claims the benefit of applications U.S.S.N. 60/180,810, filed February 7, 2000,
and 60/234,731, filed September 22, 2000, hereby expressly incorporated by reference.



FIELD OF THE INVENTION 



The present invention is directed to providing sensitive and accurate assays for gene detection,
genome-wide gene expression profiling and alternative splice monitoring, with a minimum or absence 
of target-specific amplification. 

BACKGROUND OF THE INVENTION 

The detection of specific nucleic acids is an important tool for diagnostic medicine and molecular 
biology research. Gene probe assays currently play roles in identifying infectious organisms such as 
bacteria and viruses, in probing the expression of normal and mutant genes and identifying mutant 
genes such as oncogenes, in typing tissue for compatibility preceding tissue transplantation, in
matching tissue or blood samples for forensic medicine, and for exploring homology among genes 
from different species. 

Ideally a gene probe assay should be sensitive, specific and easily automatable (for a review, see 
Nickerson, Current Opinion in Biotechnology 4:48-51 (1993)). The requirement for sensitivity (i.e. low 
detection limits) has been greatly alleviated by the development of the polymerase chain reaction 
(PCR) and other amplification technologies which allow researchers to amplify exponentially a specific
nucleic acid sequence before analysis (for a review, see Abramson et al., Current Opinion in
Biotechnology, 4:41-47 (1993)). 

Specificity, in contrast, remains a problem in many currently available gene probe assays. The extent
of molecular complementarity between probe and target defines the specificity of the interaction.



reaction temperature, and in the length of the probe may alter or influence the specificity of the
probe/target interaction.

Genes in higher eukaryotes contain introns, which are removed during RNA processing to generate
mature functional mRNAs. In most cases, the removal of introns is efficient, and thus these splicing
events are constitutive. However, many transcripts are alternatively processed to generate multiple
mRNAs from a single mRNA precursor (pre-mRNA) through the use of different 5' or 3' splice sites,
exon inclusion or exclusion, and intron retention. The complexity of gene expression is further
increased in many cases by coupling alternative splicing with alternative promoters and the use of
alternative polyadenylation sites. Based on comparison among expressed sequence tag (EST)
databases, it is estimated that as many as 30% of the genes in humans exhibit alternative splicing
(Gelfand, M. S., Dubchak, I., Dralyuk, I., & Zorn, M. (1999). ASDB: database of alternatively spliced
genes. Nucleic Acids Research 27:301-302). Considering that one transcript often gives rise to more
than two isoforms, the number of alternatively spliced mRNAs may surpass the total number of genes
that are expressed in a higher eukaryotic organism. Because alternatively spliced transcripts may
encode protein isoforms that have distinct functions, it becomes a major challenge in functional
genomics to relate a biological function not only to the expression of specific genes but also to their
isoforms resulting from post-transcriptional processing. This is particularly relevant to cancer research
as molecular alterations during malignancy may result from changes not only in gene expression but
also in RNA processing.

The functional consequences of alternative splicing play a vital role in biology and medicine, with a
number of well-known examples being illustrative. 

Epithelial cells secrete acidic Fibroblast Growth Factor (aFGF), which binds and activates its receptor
FGFR2 on the cell surface of fibroblasts. Conversely, fibroblasts secrete Keratinocyte Growth Factor
(KGF), which binds and activates KGFR on epithelial cells. Interestingly, FGFR2 and KGFR are
generated from the same pre-mRNA by alternative splicing (Miki, T., et al. (1992). Determination of
ligand-binding specificity by alternative splicing: two distinct growth factor receptors encoded by a
single gene. Proc. Natl. Acad. Sci. USA 89:246-250). Such cell-specific alternative splicing must be
tightly regulated because cells expressing both a growth factor and its specific receptor will be
transformed to uncontrolled growth.

A number of apoptotic regulators, such as Bcl-x, Ced-4, and Caspase-2 (Ich-1), have isoforms
generated by alternative splicing (reviewed by Jiang, Z. H., Zhang, W. J., Rao, Y., & Wu, J. Y. (1998).
Regulation of Ich-1 pre-mRNA alternative splicing and apoptosis by mammalian splicing factors. Proc.
Natl. Acad. Sci. USA 95:9155-9160). In each case, one form promotes programmed cell death and the
... regulating the ratio of these isoforms.

CD44 is an important cell surface molecule involved in tissue-specific targeting of T cells in the
immune system, as well as in cell adhesion and signal transduction. The transcript ...

Alterations in CD44 splicing are among the best tumor markers (reviewed by Goodison, S. & Tarin, D.
(1998). Current status of CD44 variant isoforms as cancer diagnostic markers. Histopathology 32:1-6).
CD44 alternative splicing appears to be regulated by cytokines and by oncogenes, and inclusion of a
specific exon was shown to cause tumor metastasis in a model system (Gunthert, U., et al. (1991). A
new variant of glycoprotein CD44 confers metastatic potential to rat carcinoma cells. Cell 65:13-24).

AML1 is a transcription factor required for granulocyte differentiation. The protein contains an
N-terminal DNA binding and protein dimerization domain, and a C-terminal transcriptional activation
domain. In 20% of acute myelogenous leukemia (AML) patients, the N-terminal sequence of AML1 is
... chromosome translocation. However, in many AML cases, no chromosome translocation is
detected, but a change in alternative splicing of AML1 pre-mRNA appears instead. Alternative splicing
results in a ... shown to suppress granulocyte differentiation (Tanaka, T., et al. (1995). An acute
myeloid leukemia gene, AML1, regulates hemopoietic myeloid cell differentiation and transcriptional
activation antagonistically by two alternative spliced forms. EMBO J. 14:341-350). Thus, some fraction
of AML cases may be triggered by a malfunction in splicing control and regulation.

In conclusion, alternative splicing is associated with important biological events. Therefore, the
pattern or alteration of alternative splicing may be markers for specific diseases and/or targets for
disease prevention and intervention.

Alternative RNA splicing is widespread in higher eukaryotic cells and plays a vital role in gene
expression. Detection and analysis of alternative splicing currently rely on RNase protection and
RT-PCR assays, which are labor intensive, inefficient, and low scale, especially in the era of
functional genomics.

Accordingly, it is an object of the invention to provide a very sensitive and accurate approach for 
genome-wide gene expression profiling and alternative splice monitoring with a minimum or absence 
of target-specific amplification. 

SUMMARY OF THE INVENTION 

In one aspect, the invention provides a method of detecting a first target sequence comprising a
poly(A) sequence in a sample. The method includes hybridizing a first probe to the target sequence to
form a first hybridization complex. The first probe comprises an upstream universal priming site (UUP),
an adapter sequence, a first target-specific sequence, and a downstream universal priming site (DUP).
The poly(A) sequence remains single-stranded. The method further includes contacting the first
hybridization complex with a support comprising a poly(T) sequence, such that the poly(A) sequence
hybridizes with the poly(T) sequence. In addition, the method includes removing unhybridized first
probe sequences, denaturing the first hybridization complex, amplifying the first probe to generate a
plurality of amplicons, contacting the amplicons with an array of capture probes to form assay
complexes and detecting the assay complexes.

In addition, the invention provides a method of detecting a first target sequence comprising a first
target domain, a second adjacent target domain and a poly(A) sequence. The method includes
hybridizing a first probe comprising an upstream universal priming site (UUP) and a first target-specific
sequence substantially complementary to the first target domain to the first target domain, and 
hybridizing a second probe comprising a second target-specific sequence substantially 
complementary to the second target domain and a downstream universal priming site (DUP), wherein 
at least one of the first and second probes comprises at least a first adapter sequence. The poly(A) 
sequence remains single-stranded, and the target sequence and the first and second probes form a 
ligation complex. The method further includes contacting the ligation complex with a ligase to form a 
ligated complex, contacting the ligated complex with a support comprising a poly(T) sequence, such 
that the poly(A) sequence hybridizes with the poly(T) sequence, removing unhybridized first and
second probe sequences, denaturing the ligation complex, amplifying the ligated first and second 
probes to generate a plurality of amplicons, contacting the amplicons with an array of capture probes 
to form assay complexes, and detecting the assay complexes. 



Figure 7 depicts a preferred embodiment of the invention utilizing a poly(A)-poly(T) capture to remove
unhybridized probes and targets. Target sequence 5 comprising a poly(A) sequence 6 is hybridized to
target probe 115 comprising a target specific sequence 70, an adapter sequence 20, an upstream
universal priming site 25, and a downstream universal priming site 26. The resulting hybridization
complex is contacted with a bead 51 comprising a linker 55 and a poly(T) capture probe 61.

Figure 8 depicts a preferred embodiment of removing non-hybridized target probes, utilizing an OLA
format. Target 5 is hybridized to a first ligation probe 100 comprising a first target specific sequence
15, an adapter sequence 20, an upstream universal priming site 25, and an optional label 30, and a
second ligation probe 110 comprising a second target specific sequence 16, a downstream universal
priming site 26, and a nuclease inhibitor 35. After ligation, denaturation of the hybridization complex
and addition of an exonuclease, the ligated target probe 115 and the second ligation probe 110 are all
that is left. The addition of this to an array (in this embodiment, a bead array comprising substrate 40,
bead 50 with linker 55 and capture probe 60 that is substantially complementary to the adapter
sequence 20), followed by washing away of the second ligation probe 110, results in a detectable
complex.

Figure 9 depicts a preferred rolling circle embodiment utilizing two ligation probes. Target 5 is
hybridized to a first ligation probe 100 comprising a first target specific sequence 15, an adapter
sequence 20, an upstream universal priming site 25, an adapter sequence 20 and a RCA primer
sequence 120, and a second ligation probe 110 comprising a second target specific sequence 16 and
a downstream universal priming site 26. Following ligation, an RCA sequence 130 is added,
comprising a first universal primer 27 and a second universal primer 28. The priming sites hybridize to
the primers and ligation occurs, forming a circular probe. The RCA sequence 130 serves as the RCA
primer for subsequent amplification. An optional restriction endonuclease site is not shown.

Figure 10 depicts a preferred rolling circle embodiment utilizing a single target probe. Target 5 is
hybridized to a target probe 115 comprising a first target specific sequence 15, an adapter sequence
20, an upstream universal priming site 25, a RCA priming site 140, optional label sequence 150 and a
second target specific sequence 16. Following ligation, denaturation, and the addition of the RCA
primer and extension by a polymerase, amplicons are generated. An optional restriction
endonuclease site is not shown.

Figure 11 depicts alternative splicing targets selected for microarray analysis. Abbreviations are:
GAPDH, glyceraldehyde phosphate dehydrogenase; FGFR2, fibroblast growth factor receptor; KGF,
keratinocyte growth factor; CASP, caspases; NOS, nitric oxide synthase; PCD, programmed cell death.

DETAILED DESCRIPTION OF THE INVENTION 

The present invention is directed to the detection and quantification of a variety of nucleic acid 
reactions, particularly using microsphere arrays. In particular, the invention relates to gene detection, 
gene expression profiling and alternative splice monitoring, without prior amplification of the specific 
targets. In addition, the invention can be utilized with adapter sequences to create universal arrays. 

The invention can be generally described as follows. A plurality of probes (sometimes referred to 
herein as "target probes") are designed to have at least three different portions: a first portion that is 
target-specific to an mRNA target and two "universal priming" portions, an upstream and a 
downstream universal priming sequence. These target probes are hybridized to target mRNA 
sequences from a sample, without prior amplification, to form hybridization complexes. The 
hybridization complexes (and non-hybridized target mRNA sequences) are then separated from the
unhybridized target probes. This is
accomplished by using a polyA selection method such as the use of poly(T) sequences on a support 
that can specifically retain all mRNA including the hybrids. Once the unhybridized target probes are 
removed, the hybrids are denatured. All the target probes can then be simultaneously amplified using 
universal primers that will hybridize to the upstream and downstream universal priming sequences. 
The resulting amplicons, which can be directly or indirectly labeled, can then be detected on arrays, 
particularly microsphere arrays. This allows the detection and quantification of the mRNA target 
sequences. 
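
The general workflow described above can be sketched in a few lines of Python. This is only an
illustrative model under simplifying assumptions: hybridization is treated as exact string
complementarity, poly(T) capture is treated as a check for a 3' poly(A) run, and the names
`TargetProbe`, `retained_probes`, `is_amplifiable` and the `min_poly_a` threshold are hypothetical
and do not come from the specification.

```python
from dataclasses import dataclass

# Maps each DNA or RNA base to its complementary DNA base.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq: str) -> str:
    """Return the DNA reverse complement of a DNA or RNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class TargetProbe:
    name: str
    uup: str              # upstream universal priming site
    adapter: str          # adapter sequence
    target_specific: str  # DNA complementary to a region of the mRNA target
    dup: str              # downstream universal priming site

    @property
    def sequence(self) -> str:
        return self.uup + self.adapter + self.target_specific + self.dup


def retained_probes(probes, mrnas, min_poly_a=10):
    """Model poly(T) capture: keep probes hybridized to an mRNA bearing a 3' poly(A) run.

    Assumption: a probe counts as hybridized only if its target-specific portion
    is perfectly complementary to some stretch of the mRNA.
    """
    kept = []
    for probe in probes:
        site = reverse_complement(probe.target_specific).replace("T", "U")
        for mrna in mrnas:
            if mrna.endswith("A" * min_poly_a) and site in mrna:
                kept.append(probe)
                break
    return kept


def is_amplifiable(probe: TargetProbe, uup: str, dup: str) -> bool:
    """A retained probe can be amplified if it is flanked by both universal priming sites."""
    return probe.sequence.startswith(uup) and probe.sequence.endswith(dup)
```

Because every probe carries the same two priming sites, a single primer pair suffices for all
retained probes in this model, which is the point of the universal priming design.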

As will be appreciated by those in the art, the system can take on a wide variety of conformations, 
depending on the assay. For example, when expression profiling or alternate splice junction analysis
is to be performed, a single target probe can be used. Thus, a single probe can be designed for any 
mRNA sequence, with an upstream and downstream universal primer. After separation of the ...
... relies on the fact that two adjacently hybridized probes will be ligated together by a ligase only if there
is perfect complementarity at each of the termini, i.e. at a detection position. In this embodiment, there 
are two ligation probes: a first or upstream ligation probe that comprises the upstream universal 
priming sequence and a second portion that will hybridize to a first domain of the target mRNA 
sequence (e.g. the terminus of a first exon), and a second or downstream ligation probe that 
comprises a portion that will hybridize to a second domain of the target mRNA sequence (e.g. 
complementary to the terminus of a second exon), adjacent to the first domain, and a second portion 
comprising the downstream universal priming sequence. If perfect complementarity at the junction 
exists, the ligation occurs and then the resulting hybridization complex (comprising the mRNA target 
and the ligated probe) can be separated as above from unreacted probes. Again, the universal 
priming sites are used to amplify the ligated probe to form a plurality of amplicons that are then
detected in a variety of ways, as outlined herein. 
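
The junction requirement of the ligation embodiment can be illustrated with a small sketch. It
assumes a simplified model in which only the two probe termini that meet at the junction are
checked for complementarity; the function names `complement_of` and `ligate`, and the way the
probes are split into priming and target-specific parts, are hypothetical choices made for this
example.

```python
from typing import Optional

# RNA base -> complementary DNA base
PAIR = {"A": "T", "C": "G", "G": "C", "U": "A"}


def complement_of(rna_segment: str) -> str:
    """DNA probe (written 5'->3') perfectly complementary to an RNA segment (5'->3')."""
    return "".join(PAIR[base] for base in reversed(rna_segment))


def ligate(target_rna: str, junction: int, uup: str, upstream_specific: str,
           downstream_specific: str, dup: str) -> Optional[str]:
    """Return the ligated probe, or None if either terminus at the junction mismatches.

    junction is the index of the first target base of the second domain.
    Because probes pair antiparallel to the target, the 3' terminus of the
    upstream probe pairs with target_rna[junction] and the 5' terminus of the
    downstream probe pairs with target_rna[junction - 1].
    """
    three_prime_ok = upstream_specific[-1] == PAIR[target_rna[junction]]
    five_prime_ok = downstream_specific[0] == PAIR[target_rna[junction - 1]]
    if three_prime_ok and five_prime_ok:
        return uup + upstream_specific + downstream_specific + dup
    return None
```

In this model a probe pair designed for one exon-exon junction yields no ligated product on an
isoform that joins the first exon to a different exon, so only the intended splice form gives
rise to an amplifiable probe.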



In addition, any of the above embodiments can utilize one or more "adapter sequences" (sometimes
referred to in the art as "zip codes") to allow the use of "universal arrays". That is, arrays are generated
that contain capture probes that are not target specific, but rather specific to individual artificial adapter
sequences. One strand of the adapter sequences is added to the target probes (in the case of
ligation probes, either probe may contain the adapter sequence), nested between the priming
sequences, and thus is included in the amplicons. The adapters are then hybridized to the capture
probes on the array, and detection proceeds. In some embodiments, as outlined below, there may be
two adapter sequences used in the target probes.
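
The universal array idea can be sketched as a lookup from adapter to target. The example below is
a minimal illustration, assuming exact-match hybridization between a capture probe and the adapter
carried in an amplicon; the adapter sequences, the target names used as keys, and the function
`decode` are hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(_COMPLEMENT)[::-1]


# Hypothetical adapter sequences, one per target probe.
ADAPTER_FOR_TARGET = {
    "FGFR2": "ACGTTGCAAGCTTGCATGCA",
    "KGFR": "TTGACCGATAGGCTAACGTC",
}

# Capture probes on the array are complementary to the adapters, not to the targets,
# so the same array can be reused with any probe set that carries these adapters.
CAPTURE_PROBES = {
    target: reverse_complement(adapter)
    for target, adapter in ADAPTER_FOR_TARGET.items()
}


def decode(amplicons):
    """Count amplicons per target by finding which capture probe each one would bind."""
    counts = Counter()
    for amplicon in amplicons:
        for target, capture in CAPTURE_PROBES.items():
            if reverse_complement(capture) in amplicon:
                counts[target] += 1
    return counts
```

Changing the assay to a new set of targets then only requires new target-specific portions in the
probes, while the array and its capture probes stay the same.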

The present invention provides several significant advantages. The method can be used to detect
gene expression or alternative splicing events from a single cell or a few cells because of signal
amplification of annealed probes. It also allows the direct hybridization of the probes to RNA targets,
thus omitting a cDNA conversion step. Additionally, the hybridization reaction occurs in solution rather
than on a surface, so that RNAs hybridize more predictably and with favorable kinetics according to
their thermodynamic properties. The removal of excess probes and targets allows the isolated targets
to reflect the level of individual gene expression or splicing events in cells, and the background
signal (due to non-specific interactions) is reduced. Finally, the use of universal primers avoids biased
signal amplification in PCR.

Accordingly, the present invention provides compositions and methods for detecting, quantifying
and/or identifying specific polyadenylated mRNA nucleic acid sequences in a sample. As will be
appreciated by those in the art, the sample solution may comprise any number of things, including, but
not limited to, bodily fluids (including, but not limited to, blood, urine, serum, lymph, saliva, anal and
vaginal secretions, perspiration and semen, of virtually any organism, with mammalian samples being
preferred and human samples being particularly preferred). The sample may comprise individual 
cells, including primary cells (including bacteria), and cell lines, including, but not limited to, tumor cells 
of all types (particularly melanoma, myeloid leukemia, carcinomas of the lung, breast, ovaries, colon, 
kidney, prostate, pancreas and testes), cardiomyocytes, endothelial cells, epithelial cells, lymphocytes 
(T-cell and B cell), mast cells, eosinophils, vascular intimal cells, hepatocytes, leukocytes including
mononuclear leukocytes, stem cells such as haemopoietic, neural, skin, lung, kidney, liver and
myocyte stem cells; osteoclasts, chondrocytes and other connective tissue cells, keratinocytes, 
melanocytes, liver cells, kidney cells, and adipocytes. Suitable cells also include known research cells, 
including, but not limited to, Jurkat T cells, NIH3T3 cells, CHO, Cos, 923, HeLa, WI-38, Weri-1, MG-63,
etc. See the ATCC cell line catalog, hereby expressly incorporated by reference.

The present invention provides compositions and methods for detecting the presence or absence of 
target mRNA nucleic acid sequences in a sample. "Target sequence" or grammatical equivalents as 
used herein means a polyadenylated mRNA sequence or a secondary target such as an amplicon. 
As is outlined herein, the target sequence may be a polyadenylated mRNA target sequence from a 
sample, or a secondary target such as an amplicon, which is the product of an amplification reaction
such as PCR. Thus, for example, a polyadenylated mRNA target sequence from a sample is amplified
to produce an amplicon that is detected. The polyadenylated mRNA target sequence may be any 
length, with the understanding that longer sequences are more specific. As is outlined more fully 
below, probes are made to hybridize to polyadenylated mRNA target sequences to determine the 
presence, absence, quantity or sequence of a polyadenylated mRNA target sequence in a sample. 
Generally speaking, this term will be understood by those skilled in the art. 

The polyadenylated mRNA target sequence may also be comprised of different target domains, that 
may be adjacent (i.e. contiguous) or separated. For example, in the OLA techniques outlined below, a 
first ligation probe may hybridize to a first target domain and a second ligation probe may hybridize to a 
second target domain; either the domains are adjacent, or they may be separated by one or more 
nucleotides, coupled with the use of a polymerase and dNTPs, as is more fully outlined below. The 
terms "first" and "second" are not meant to confer an orientation of the sequences with respect to the
5'-3' orientation of the target mRNA sequence. For example, assuming a 5'-3' orientation of the
complementary target sequence, the first target domain may be located either 5' to the second 
domain or 3' to the second domain. In addition, as will be appreciated by those in the art, the probes 
on the surface of the array (e.g. attached to the microspheres) may be attached in either orientation, 
either such that they have a free 3' end or a free 5' end; in some embodiments, the probes can be 
attached at one or more internal positions, or at both ends.



If required, the polyadenylated mRNA target sequence is prepared using known techniques. For
example, the sample may be treated to lyse the cells, using known lysis buffers, sonication,
electroporation, etc., with purification and amplification as outlined below occurring as needed, as will
be appreciated by those in the art. In addition, the reactions outlined herein may be accomplished in a
variety of ways, as will be appreciated by those in the art. Components of the reaction may be added
simultaneously, or sequentially, in any order, with preferred embodiments outlined below. In addition,
the reaction may include a variety of other reagents which may be included in the assays. These
include reagents like salts, buffers, neutral proteins, e.g. albumin, detergents, etc., which may be used
to facilitate optimal hybridization and detection, and/or reduce non-specific or background interactions.
Also reagents that otherwise improve the efficiency of the assay, such as protease inhibitors, nuclease
inhibitors, anti-microbial agents, etc., may be used, depending on the sample preparation methods
and purity of the target.

It should be noted that in some cases, two poly(T) steps are used. In one embodiment, a poly(T)
support is used to remove unreacted target probes from the sample. However, a poly(T) support may
be used to purify or concentrate poly(A) mRNA from a sample prior to running the assay. For 
example, total RNA may be isolated from a cell population, and then the poly(A) mRNA isolated from 
the total RNA and fed into the assay systems described below. 

In addition, in most embodiments, double stranded target nucleic acids are denatured to render them 
single stranded so as to permit hybridization of the primers and other probes of the invention. A 
preferred embodiment utilizes a thermal step, generally by raising the temperature of the reaction to 
about 95°C, although pH changes and other techniques may also be used.

As outlined herein, the invention provides a number of different nucleic acid primers and probes. By
"nucleic acid" or "oligonucleotide" or grammatical equivalents herein is meant at least two nucleotides
covalently linked together. A nucleic acid of the present invention will generally contain
phosphodiester bonds, although in some cases, as outlined below, particularly for use with probes,
nucleic acid analogs are included that may have alternate backbones, comprising, for example,
phosphoramide (Beaucage et al., Tetrahedron 49(10):1925 (1993) and references therein; Letsinger,
J. Org. Chem. 35:3800 (1970); Sprinzl et al., Eur. J. Biochem. 81:579 (1977); Letsinger et al., Nucl.
Acids Res. 14:3487 (1986); Sawai et al., Chem. Lett. 805 (1984); Letsinger et al., J. Am. Chem. Soc.
110:4470 (1988); and Pauwels et al., Chemica Scripta 26:141 (1986)), phosphorothioate (Mag et al.,
Nucleic Acids Res. 19:1437 (1991); and U.S. Patent No. 5,644,048), phosphorodithioate (Briu et al., J.
Am. Chem. Soc. 111:2321 (1989)), O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides
and Analogues: A Practical Approach, Oxford University Press), and peptide nucleic acid backbones
and linkages (see Egholm, J. Am. Chem. Soc. 114:1895 (1992); Meier et al., Chem. Int. Ed. Engl.
31:1008 (1992); Nielsen, Nature 365:566 (1993); Carlsson et al., Nature 380:207 (1996), all of which
are incorporated by reference). Other analog nucleic acids include those with positive backbones
(Denpcy et al., Proc. Natl. Acad. Sci. USA 92:6097 (1995)); non-ionic backbones (U.S. Patent Nos.
5,386,023, 5,637,684, 5,602,240, 5,216,141 and 4,469,863; Kiedrowshi et al., Angew. Chem. Intl. Ed.
English 30:423 (1991); Letsinger et al., J. Am. Chem. Soc. 110:4470 (1988); Letsinger et al.,
Nucleoside & Nucleotide 13:1597 (1994); Chapters 2 and 3, ASC Symposium Series 580,
"Carbohydrate Modifications in Antisense Research", Ed. Y.S. Sanghui and P. Dan Cook; Mesmaeker
et al., Bioorganic & Medicinal Chem. Lett. 4:395 (1994); Jeffs et al., J. Biomolecular NMR 34:17
(1994); Tetrahedron Lett. 37:743 (1996)) and non-ribose backbones, including those described in U.S.
Patent Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ASC Symposium Series 580,
"Carbohydrate Modifications in Antisense Research", Ed. Y.S. Sanghui and P. Dan Cook. Nucleic
acids containing one or more carbocyclic sugars are also included within the definition of nucleic acids
(see Jenkins et al., Chem. Soc. Rev. (1995) pp 169-176). Several nucleic acid analogs are described
in Rawls, C & E News, June 2, 1997, page 35. The nucleic acids can also be "locked nucleic acids".
All of these references are hereby expressly incorporated by reference. These modifications of the
ribose-phosphate backbone may be done to facilitate the addition of labels, or to increase the stability
and half-life of such molecules in physiological environments.

As will be appreciated by those in the art, all of these nucleic acid analogs may find use as probes in
the present invention. In addition, mixtures of naturally occurring nucleic acids and analogs can be
made. Alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring
nucleic acids and analogs may be made.

Particularly preferred are peptide nucleic acids (PNA), which include peptide nucleic acid analogs.
These backbones are substantially non-ionic under neutral conditions, in contrast to the highly charged
phosphodiester backbone of naturally occurring nucleic acids. This results in two advantages. First,
the PNA backbone exhibits improved hybridization kinetics. PNAs have larger changes in the melting
temperature (Tm) for mismatched versus perfectly matched basepairs. DNA and RNA typically exhibit
a 2-4°C drop in Tm for an internal mismatch. With the non-ionic PNA backbone, the drop is closer to
7-9°C. Similarly, due to their non-ionic nature, hybridization of the bases attached to these backbones
is relatively insensitive to salt concentration.

The probe nucleic acids may be single stranded or double stranded, as specified, or contain portions
of both double stranded or single stranded sequence. Thus, for example, when the target sequence is
a polyadenylated mRNA, the hybridization complex comprising the target probe has a double stranded
portion, where the target probe is hybridized, and one or more single stranded portions, including the
poly(A) portion. The nucleic acid may contain any combination of deoxyribo- and ribo-nucleotides, and
any combination of bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine,
hypoxanthine, isocytosine, isoguanine, etc. A preferred embodiment utilizes isocytosine and
isoguanine in nucleic acids designed to be complementary to other probes, rather than target
sequences, as this reduces non-specific hybridization, as is generally described in U.S. Patent No.
5,681,702. As used herein, the term "nucleoside" includes nucleotides as well as nucleoside and
nucleotide analogs, and modified nucleosides such as amino modified nucleosides. In addition,
"nucleoside" includes non-naturally occurring analog structures. Thus, for example, the individual units
of a peptide nucleic acid, each containing a base, are referred to herein as a nucleoside.

Probes and primers of the present invention are designed to have at least a portion be complementary
to a polyadenylated mRNA target sequence (either the polyadenylated mRNA target sequence of the
sample or to other probe sequences, such as portions of amplicons, as is described below), such that
hybridization of the polyadenylated mRNA target sequence and the probes of the present invention
occurs. As outlined below, this complementarity need not be perfect; there may be any number of
base pair mismatches which will interfere with hybridization between the polyadenylated mRNA target
sequence and the single stranded nucleic acids of the present invention. However, if the number of
mutations is so great that no hybridization can occur under even the least stringent of hybridization
conditions, the sequence is not a complementary polyadenylated mRNA target sequence. Thus, by
"substantially complementary" herein is meant that the probes are sufficiently complementary to the
polyadenylated mRNA target sequences to hybridize under normal reaction conditions, and preferably
give the required specificity.

A variety of hybridization conditions may be used in the present invention, including high, moderate
and low stringency conditions; see for example Maniatis et al., Molecular Cloning: A Laboratory
Manual, 2d Edition, 1989, and Short Protocols in Molecular Biology, ed. Ausubel, et al., hereby
incorporated by reference. Stringent conditions are sequence-dependent and will be different in
different circumstances. Longer sequences hybridize specifically at higher temperatures. An
extensive guide to the hybridization of nucleic acids is found in Tijssen, Techniques in Biochemistry
and Molecular Biology-Hybridization with Nucleic Acid Probes, "Overview of principles of hybridization
and the strategy of nucleic acid assays" (1993). Generally, stringent conditions are selected to be
about 5-10°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic
strength and pH. The Tm is the temperature (under defined ionic strength, pH and nucleic acid
concentration) at which 50% of the probes complementary to the target hybridize to the polyadenylated
mRNA target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of
the probes are occupied at equilibrium). Stringent conditions will be those in which the salt
concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion
concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short
probes (e.g. 10 to 50 nucleotides) and at least about 60°C for long probes (e.g. greater than 50
nucleotides). Stringent conditions may also be achieved with the addition of helix destabilizing agents
such as formamide. The hybridization conditions may also vary when a non-ionic backbone, i.e. PNA,
is used, as is known in the art. In addition, cross-linking agents may be added after target binding to
cross-link, i.e. covalently attach, the two strands of the hybridization complex.
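
The numerical bounds given above for stringent conditions can be collected into a simple check.
The sketch below encodes only the salt, pH and minimum-temperature limits stated in the text and
reads each "about" as an exact cutoff, which is an assumption made for illustration; the function
name `is_stringent` and its parameters are hypothetical, and effects of formamide or non-ionic
backbones are not modeled.

```python
def is_stringent(sodium_molar: float, ph: float, temp_c: float, probe_length: int) -> bool:
    """Check hybridization conditions against the stated bounds for stringent conditions.

    Assumptions: 'short' probes are 50 nucleotides or fewer and 'long' probes
    are longer than 50 nucleotides; all limits are treated as hard cutoffs.
    """
    salt_ok = sodium_molar < 1.0          # less than about 1.0 M sodium ion
    ph_ok = 7.0 <= ph <= 8.3              # pH 7.0 to 8.3
    min_temp_c = 30.0 if probe_length <= 50 else 60.0
    return salt_ok and ph_ok and temp_c >= min_temp_c
```

For example, a 25-nucleotide probe at 0.1 M sodium ion, pH 7.5 and 45°C passes this check,
whereas an 80-nucleotide probe under the same conditions does not, because the minimum
temperature for long probes is higher.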

Thus the assays are generally run under stringency conditions which allow formation of the first
hybridization complex only in the presence of target. Stringency can be controlled by altering a step
parameter that is a thermodynamic variable, including, but not limited to, temperature, formamide
concentration, salt concentration, chaotropic salt concentration, pH, organic solvent concentration, etc.
These parameters may also be used to control non-specific binding, as is generally outlined in U.S.
Patent No. 5,681,697. Thus it may be desirable to perform certain steps at higher stringency
conditions to reduce non-specific binding.

The size of the primer and probe nucleic acid may vary, as will be appreciated by those in the art, with
each portion of the probe and the total length of the probe in general varying from 5 to 500 nucleotides
in length. Each portion is preferably between 10 and 100 nucleotides in length, with between 15 and 50
being particularly preferred, and from 10 to 35 being especially preferred, depending on the use and
amplification technique. Thus, for example, the universal priming sites of the probes are each
preferably about 15-20 nucleotides in length, with 18 being especially preferred. The adapter
sequences of the probes are preferably from 15-25 nucleotides in length, with 20 being especially
preferred. The target specific portion of the probe is preferably from 15-50 nucleotides in length, with
from 30 to 40 being especially preferred.

Accordingly, the present invention provides first target probe sets. By "probe set" herein is meant a
plurality of target probes that are used in a particular multiplexed assay. In this context, plurality
means at least two, with more than 10 being preferred, depending on the assay, sample and purpose
of the test.

Accordingly, the present invention provides first target probe sets that comprise universal priming
sites. By "universal priming site" herein is meant a sequence of the probe that will bind a PCR primer
for amplification. Each probe preferably comprises an upstream universal priming site (UUP) and a
downstream universal priming site (DUP). Again, "upstream" and "downstream" are not meant to
convey a particular 5'-3' orientation, and will depend on the orientation of the system. Preferably, only
a single UUP sequence and a single DUP sequence is used in a probe set, although as will be
appreciated by those in the art, different assays or different multiplexing analysis may utilize a plurality
of universal priming sequences. In addition, the universal priming sites are preferably located at the 5'
and 3' termini of the target probe (or the ligated probe), as only sequences flanked by priming
sequences will be amplified.

In addition, universal priming sequences are generally chosen to be as unique as possible given the
particular assays and host genomes to ensure specificity of the assay. In general, universal priming
sequences range in size from about 5 to about 35 basepairs, with from about 15 to about 20 being
particularly preferred.

As will be appreciated by those in the art, the orientation of the two priming sites is different. That is,
one PCR primer will directly hybridize to the first priming site, while the other PCR primer will hybridize
to the complement of the second priming site. Stated differently, the first priming site is in sense
orientation, and the second priming site is in antisense orientation.
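
The probe layout and the differing orientation of the two priming sites can be illustrated with a short sketch. Everything in it is hypothetical: the sequences, the function names and the particular order of the internal elements are assumptions chosen only to show that one primer anneals to the probe itself while the other anneals to the strand copied from it.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def build_target_probe(site_5p: str, adapter: str, target_specific: str, site_3p: str) -> str:
    """Assemble a probe with the priming sites on the outermost ends."""
    return site_5p + adapter + target_specific + site_3p


def universal_primers(site_5p: str, site_3p: str) -> tuple:
    """Return (primer annealing to the probe, primer annealing to its copy)."""
    first_primer = reverse_complement(site_3p)  # hybridizes directly to the probe
    second_primer = site_5p  # hybridizes to the complement of the 5' site
    return first_primer, second_primer


# Hypothetical sequences, for illustration only.
SITE_5P = "ACTGGCCGTCGTTTTACA"
SITE_3P = "GGTCATAGCTGTTTCCTG"
ADAPTER = "TTGACCGATAGGCATCCGAT"
TARGET_SPECIFIC = "CAGGATTCCGATAGCTTAGGCATCGATTGC"

probe = build_target_probe(SITE_5P, ADAPTER, TARGET_SPECIFIC, SITE_3P)
first, second = universal_primers(SITE_5P, SITE_3P)

# Extending the first primer along the probe yields the complementary strand.
copy_strand = reverse_complement(probe)

if __name__ == "__main__":
    print("first primer anneals to probe:", probe.endswith(reverse_complement(first)))
    print("second primer anneals to copy:", copy_strand.endswith(reverse_complement(second)))
```

Because both primers depend only on the two priming sites, the same primer pair would amplify every probe in a set that shares those sites, whatever adapter or target specific sequence lies between them.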

In addition to the universal priming sites, the target probes comprise at least a first target-specific
sequence that is substantially complementary to the polyadenylated mRNA target sequence. As
outlined below, ligation probes each comprise a target-specific sequence. As will be appreciated by
those in the art, the target-specific sequence may take on a wide variety of formats depending on the
use of the probe. For example, for straight polyadenylated mRNA target sequence detection or gene
expression monitoring or profiling, the target-specific probe sequence comprises a portion that will
hybridize to all or part of the polyadenylated mRNA target sequence. In addition,
particular polyadenylated mRNA target sequences are preferred, including, but not limited to, target
sequences that span splice junctions.

In a preferred embodiment, the target specific sequence spans a splice junction of interest. As
outlined herein, the target specific sequences are designed to be substantially complementary to
sequences at the end of individual alternative exons. By substantially complementary herein is meant
that the probes are sufficiently complementary to the target sequences to hybridize under normal
reaction conditions, and preferably give the required annealing specificity. Since exons are separated
by introns, the detection sequences reside on different parts of the RNA molecule. Thus a target
specific sequence is composed of two parts: an upstream portion, complementary to the terminus of a
first exon, and a downstream portion, complementary to the terminus of a second exon. Only if
splicing has occurred and the intervening intron has been excised will the target specific sequence
hybridize to the target sequence under the conditions of the assay.
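
The logic of a junction-spanning target specific sequence can be sketched as a simple string model. This is an illustration under stated assumptions: the exon and intron sequences are invented, DNA letters are used for the transcript for simplicity, hybridization is reduced to an exact-match test, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def junction_probe(upstream_exon: str, downstream_exon: str, arm: int = 6) -> str:
    """Probe complementary to the last `arm` bases of one exon joined to the
    first `arm` bases of the next exon."""
    junction = upstream_exon[-arm:] + downstream_exon[:arm]
    return reverse_complement(junction)


def hybridizes(probe: str, transcript: str) -> bool:
    """Crude model: the probe binds only if its full complement is present."""
    return reverse_complement(probe) in transcript


# Hypothetical sequences, for illustration only.
EXON_1 = "ATGGCCAAGTTCGA"
INTRON = "GTAAGTCTTTCCAG"
EXON_2 = "CTGACCGGTATCAA"

spliced = EXON_1 + EXON_2
unspliced = EXON_1 + INTRON + EXON_2
probe = junction_probe(EXON_1, EXON_2)

if __name__ == "__main__":
    print("spliced transcript bound:", hybridizes(probe, spliced))
    print("unspliced transcript bound:", hybridizes(probe, unspliced))
```

In the unspliced transcript the two halves of the probe's binding site are separated by the intron, so the full-length complement is never contiguous; that is the property the embodiment relies on.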

In another embodiment, two target probes are used to allow the use of OLA assay systems and higher
specificity. This may be done to detect any polyadenylated mRNA target sequences, splice junctions,
etc., but it finds particular use in the detection of splice junctions.

The basic OLA method can be run at least two different ways; in a first embodiment, only one strand of
a target sequence is used as a template for [...]
and WO 89/09835, all of which are incorporated by reference. The discussion below focuses on OLA,
but as those in the art will appreciate, this can easily be applied to LCR as well.

In this embodiment, the target probes comprise at least a first ligation probe and a second ligation
probe. The method is based on the fact that two probes can be preferentially ligated together, if they
are hybridized to a target strand and if perfect complementarity exists at the junction.

In a preferred embodiment, when the assay is done for gene expression [...] junction analysis,
two ligation probes are designed, each with a target specific portion. The first
ligation probe is designed to be substantially complementary to a first target domain and the second
ligation probe is substantially complementary to a second target domain. As outlined herein, in a
preferred embodiment the first and second target domains are directly adjacent, e.g. they have no
intervening nucleotides. In an alternative embodiment, the first and second target domains are
indirectly adjacent, e.g. there are intervening nucleotides, and the system includes a polymerase and
dNTPs that can be used to "fill in" the gap prior to ligation.

The target specific sequences of the first and second ligation probes are designed
[...] is spliced out. By having the target probes hybridize "across" the splice junction, or require
ligation at the splice junction, ultimate detection of the splice junction is [...]. In
this embodiment, the junction of the probes may be the splice junction itself, or it may be offset by
one or more nucleotides in either direction.

In this embodiment, at least a first ligation probe is hybridized to the first target domain and a second
ligation probe is hybridized to the second target domain. If perfect complementarity exists at the
junction, a ligation structure is formed such that the two probes can be ligated together to form a
ligated probe. If this complementarity does not exist, no ligation structure is formed and the probes
are not ligated together to an appreciable degree. This may be done using heat cycling, to allow the
ligated probe to be denatured off the polyadenylated mRNA target sequence such that it may serve as
a template for further reactions. In addition, as is more fully outlined below, this method may also be
done using three ligation probes or ligation probes that are separated by one or more nucleotides, if
dNTPs and a polymerase are added (this is sometimes referred to as "Genetic Bit" analysis).
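
The two conditions for ligation described above (both probes hybridized, and perfect complementarity at the junction) can be expressed as a toy model. This is a sketch under stated assumptions rather than a description of ligase behavior: hybridization is approximated by a mismatch count, the tolerance of one mismatch per probe is an arbitrary illustrative value, and all names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def can_ligate(target: str, probe_a: str, probe_b: str, start: int,
               gap: int = 0, max_mismatches: int = 1) -> bool:
    """Return True if the two probes would be joined on `target`.

    Footprints are in target coordinates: probe_a covers target[start:...],
    probe_b covers the bases that follow after `gap` unpaired target bases.
    """
    foot_a = reverse_complement(probe_a)
    foot_b = reverse_complement(probe_b)
    b_start = start + len(foot_a) + gap
    site_a = target[start:start + len(foot_a)]
    site_b = target[b_start:b_start + len(foot_b)]
    if len(site_a) < len(foot_a) or len(site_b) < len(foot_b):
        return False  # a probe hangs off the end of the target
    hybridized = (mismatches(foot_a, site_a) <= max_mismatches
                  and mismatches(foot_b, site_b) <= max_mismatches)
    junction_perfect = foot_a[-1] == site_a[-1] and foot_b[0] == site_b[0]
    # A gap must first be filled in by a polymerase, which is not modeled here.
    return hybridized and gap == 0 and junction_perfect


# Hypothetical sequences, for illustration only.
TARGET = "GATTACAGGCTTAACG"
PROBE_A = reverse_complement(TARGET[:8])
PROBE_B = reverse_complement(TARGET[8:])
MUTANT = TARGET[:7] + "T" + TARGET[8:]  # single base change at the junction

if __name__ == "__main__":
    print("matched target:", can_ligate(TARGET, PROBE_A, PROBE_B, 0))
    print("junction mismatch:", can_ligate(MUTANT, PROBE_A, PROBE_B, 0))
```

The mutant target still satisfies the loose hybridization test, yet fails at the junction check, which mirrors why ligation adds specificity beyond hybridization alone.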

In general, each target specific sequence of a ligation probe is at least about 5 nucleotides long, with
sequences of from about 15 to 30 being preferred and 20 being especially preferred.

In a preferred embodiment, three or more ligation probes are used. This general idea is depicted in
Figure 6. In this embodiment, there is an intervening ligation probe, specific to a third domain of the
target sequence, that is used.

In a preferred embodiment, the two ligation target probes are not directly adjacent. In this
embodiment they may be separated by one or more bases. The addition of dNTPs and a
polymerase, as outlined below for the amplification reactions, followed by the ligation reaction, allows
the formation of the ligated probe.

In addition to the universal priming sites and the target specific sequence(s), the target probes of the
invention further comprise one or more adapter sequences. An "adapter sequence" is a sequence,
generally exogenous to the target sequences, e.g. artificial, that is designed to be substantially
complementary (and preferably perfectly complementary) to a capture probe on the array. The use of
adapter sequences allows the creation of more "universal" surfaces; that is, one standard array,
comprising a finite set of capture probes, can be made and used in any application. The end-user can
customize the array by designing different soluble target probes, which, as will be appreciated by
those in the art, is generally simpler and less costly. In a preferred embodiment, an array of different
and usually artificial capture probes is made; that is, the capture probes do not have complementarity
to known target sequences. The adapter sequences can then be incorporated in the target probes.
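
The idea of a "universal" array can be illustrated with a small lookup sketch: the capture probes are fixed, and only the assignment of adapters to targets changes between assays. The adapter sequences, target names and function names below are hypothetical, and array addresses are modeled simply as list indices.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical artificial adapters; the array carries their complements.
ADAPTERS = [
    "ACGTTGCATGCTAGGTCAGT",
    "TTGACCGATAGGCATCCGAT",
    "GGCATTACGATCCGTAAGCT",
]
CAPTURE_PROBES = [reverse_complement(a) for a in ADAPTERS]  # index = array address


def assign_adapters(targets: list) -> dict:
    """Give each target of an assay its own adapter from the fixed set."""
    if len(targets) > len(ADAPTERS):
        raise ValueError("more targets than adapters on the array")
    return {target: ADAPTERS[i] for i, target in enumerate(targets)}


def address_of(adapter: str) -> int:
    """Array address whose capture probe is complementary to `adapter`."""
    return CAPTURE_PROBES.index(reverse_complement(adapter))


def decode(addresses: list, assignment: dict) -> list:
    """Translate addresses that gave a signal back into target names."""
    by_address = {address_of(a): t for t, a in assignment.items()}
    return [by_address[x] for x in addresses if x in by_address]


if __name__ == "__main__":
    assay_one = assign_adapters(["geneA", "geneB"])
    assay_two = assign_adapters(["geneX", "geneY", "geneZ"])
    # The same physical address reports a different target in each assay.
    print(decode([1], assay_one), decode([1], assay_two))
```

Only the soluble target probes, and therefore the assignment table, differ between the two assays; the capture probe list is never modified.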

As will be appreciated by those in the art, the length of the adapter sequences will vary, depending on
the desired "strength" of binding and the number of different adapters desired. In a preferred
embodiment, adapter sequences range from about 5 to about 50 basepairs in length, with from about
5 to about 30 being preferred, and from about 15 to about 25 being particularly preferred.

As will be appreciated by those in the art, the placement and orientation of the adapter sequences
can vary widely, depending on the configuration of the assay and the assay itself. For example, in most
of the OLA embodiments depicted herein, the adapter sequences are shown on the "upstream"
probe; however, the downstream probe can also be used. Basically, as will be appreciated by those
in the art, the different components of the target probes can be placed in any order, just as long as the
universal priming sites remain on the outermost ends of the probe, to allow all sequences between
them to be amplified. In general, the adapter sequences will have similar hybridization characteristics,
e.g. same or similar melting temperature, similar (G+C) content.

In a preferred embodiment, two adapter sequences per ligated target probe are used. That is, as is
generally depicted in Figure 6, each ligation probe can comprise a different adapter sequence. The
ligated probe will then hybridize to two different addresses on the array; this provides a level of quality
control and specificity. In addition, it is also possible to use two adapter sequences for single target
probes, if desired.

In some embodiments, it is possible to use one or both of the universal primers as an adapter
sequence. That is, one of the universal primers can be used to hybridize to a capture probe on the
surface. However, in this embodiment one of the universal primers must be target specific; e.g. one
of the universal primers is not really "universal", but rather one primer for each target must allow
attachment to a different capture probe.

Thus the present invention provides target probes that comprise universal priming sequences, target
specific sequence(s) and adapter sequences. These target probes are then added to the target
sequences to form hybridization complexes. As will be appreciated by those in the art, the
hybridization complexes contain portions that are double stranded (the target-specific sequences of
the target probes hybridized to a portion of the target mRNA sequence) and portions that are single
stranded (the ends of the target probes comprising the universal priming sequences and the adapter
sequences, and any unhybridized portion of the target sequence, such as poly(A) tails, as outlined
herein).



Preferred assay formats of the present invention are shown in the Figures.



Once the hybridization complexes are formed, unhybridized probes are removed. This is important as
target probes may form some unpredictable structure which will complicate the amplification using
the universal priming sequences. Thus to ensure specificity (e.g. that target probes directed to target
sequences that are not present in the sample are not amplified and detected), it is important to remove
all the nonhybridized probes. The separation of unhybridized target probes is done utilizing supports
comprising poly(T) sequences.

Thus for example, supports (as defined below), particularly magnetic beads, comprising poly(T)
sequences are added to the mixture comprising the target sequences and the target probes. In this
embodiment, the first hybridization complexes comprise a single-stranded portion comprising a poly(A)
sequence, generally ranging from 10 to 100s of adenosines. The poly(T) support is then used to
separate the unhybridized target probes from the hybridization complexes. For example, when
magnetic beads are used, they may be removed from the mixture and washed; non-magnetic beads
may be removed via centrifugation and washed, etc. Alternatively, the poly(T) supports can be packed
into a column and the assay mixture run through. In a particularly preferred embodiment, the poly(A)
sequence is immobilized to the inside of a tube, i.e. a PCR tube, that is coated with poly(T). The
hybridization complexes are then released (and denatured) from the beads using a denaturation step
such as a thermal step.

Once the non-hybridized probes (and additionally, if preferred, other sequences from the sample that
are not of interest) are removed, the hybridization complexes are denatured and the target probes are
amplified to form amplicons, which are then detected. This can be done in one of several ways,
including PCR amplification and rolling circle amplification. In addition, as outlined below, labels can
be incorporated into the amplicons in a variety of ways.

In a preferred embodiment, the target amplification technique is PCR. The polymerase chain reaction
(PCR) is widely used and described, and involves the use of primer extension combined with thermal
cycling to amplify a target sequence; see U.S. Patent Nos. 4,683,195 and 4,683,202, and PCR
Essential Data, J. W. Wiley & Sons, Ed. C.R. Newton, 1995, all of which are incorporated by reference.

In general, PCR may be briefly described as follows. The double stranded hybridization complex is
denatured, generally by raising the temperature, and then cooled in the presence of an excess of a
PCR primer, which then hybridizes to the first universal priming site. A DNA polymerase then acts to
extend the primer with dNTPs, resulting in the synthesis of a new strand forming a hybridization
complex. The sample is then heated again, to disassociate the hybridization complex, and the
process is repeated. By using a second PCR primer for the complementary target strand that
hybridizes to the second universal priming site, rapid and exponential amplification occurs. Thus PCR
steps are denaturation, annealing and extension. The particulars of PCR are well known, and include
the use of a thermostable polymerase such as Taq I polymerase and thermal cycling. Suitable DNA
polymerases include, but are not limited to, the Klenow fragment of DNA polymerase I, SEQUENASE
1.0 and SEQUENASE 2.0 (U.S. Biochemical), T5 DNA polymerase and Phi29 DNA polymerase.

The reaction is initiated by introducing the target probe to a solution comprising the universal primers,
a polymerase and a set of nucleotides. By "nucleotide" in this context herein is meant a
deoxynucleoside-triphosphate (also called deoxynucleotides or dNTPs, e.g. dATP, dTTP, dCTP and
dGTP). In some embodiments, as outlined below, one or more of the nucleotides may comprise a
detectable label, which may be either a primary or a secondary label. In addition, the nucleotides may
be nucleotide analogs, depending on the configuration of the system. Similarly, the primers may
comprise a primary or secondary label.

Accordingly, the PCR reaction requires at least one and preferably two PCR primers, a polymerase,
and a set of dNTPs. As outlined herein, the primers may comprise the label, or one or more of the
dNTPs may comprise a label.
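
The difference between using one primer and two can be made concrete with a short calculation. This is an idealized sketch: the per-cycle efficiency parameter, the function names and the assumption that reagents never become limiting are all illustrative simplifications.

```python
def two_primer_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized exponential PCR yield: each cycle multiplies by (1 + efficiency)."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


def one_primer_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized linear yield: only the original templates are copied each cycle."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * efficiency * cycles


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, one_primer_copies(1, n), two_primer_copies(1, n))
```

With two primers every newly made strand becomes a template in the next cycle, which is why the second universal priming site turns linear copying into exponential amplification.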

In a preferred embodiment, the amplification reaction utilizes rolling circle amplification. "Rolling circle
amplification" is based on extension of a circular probe that has hybridized to a target sequence. A
polymerase is added that extends the probe sequence. As the circular probe has no terminus, the
polymerase repeatedly extends the circular probe resulting in concatamers of the circular probe. As
such the probe is amplified. Rolling-circle amplification is generally described in Baner et al. (1998)
Nuc. Acids Res. 26:5073-5078; Barany, F. (1991) Proc. Natl. Acad. Sci. USA 88:189-193; and Lizardi
et al. (1998) Nat. Genet. 19:225-232, all of which are incorporated by reference in their entirety.

In general, RCA may be described in two ways, as generally depicted in Figures 9 and 10. First, as is
outlined in more detail below, a single target probe is hybridized with a target nucleic acid. Each
terminus of the probe hybridizes adjacently on the target nucleic acid and the OLA assay as described
above occurs. When ligated, the probe is circularized while hybridized to the target nucleic acid.
Addition of a polymerase results in extension of the circular probe. However, since the probe has no
terminus, the polymerase continues to extend the probe repeatedly. This results in amplification of
the circular probe.

A second alternative approach involves a two step process. In this embodiment, two ligation probes
are initially ligated together, each containing a universal priming sequence. A rolling circle primer is
then added, which has portions that will hybridize to the universal priming sequences. The presence
of the ligase then causes the original probe to circularize, using the rolling circle primer as the
polymerase primer, which is then amplified as above.

These embodiments also have the advantage that unligated probes need not necessarily be removed,
as in the absence of the target, no significant amplification will occur. These benefits may be
maximized by the design of the probes; for example, in the first embodiment, when there is a single
target probe, placing the universal priming site close to the 5' end of the probe since this will only serve
to generate short, truncated pieces, without adapters, in the absence of the ligation reaction.

Accordingly, in a preferred embodiment, a single oligonucleotide is used both for OLA and as the
circular template for RCA (referred to herein as a "padlock probe" or a "RCA probe"). That is, each
terminus of the oligonucleotide contains sequence complementary to the target nucleic acid and
functions as an OLA primer as described above. That is, the first end of the RCA probe is
substantially complementary to a first target domain, and the second end of the RCA probe is
substantially complementary to a second target domain, adjacent to the first domain. Hybridization of
the oligonucleotide to the target nucleic acid results in the formation of a hybridization complex.
Ligation of the "primers" (which are the discrete ends of a single oligonucleotide) results in the
formation of a modified hybridization complex containing a circular probe, i.e. an RCA template
complex. That is, the oligonucleotide is circularized while still hybridized with the target nucleic acid.
This serves as a circular template for RCA. Addition of a primer and a polymerase to the RCA
template complex results in the formation of an amplicon.

Labeling of the amplicon can be accomplished in a variety of ways; for example, the polymerase may
incorporate labelled nucleotides, or alternatively, a label probe is used that is substantially
complementary to a portion of the RCA probe and comprises at least one label, as is generally
outlined herein.

The polymerase can be any polymerase, but is preferably one lacking 3' exonuclease activity (3' exo-).
Examples of suitable polymerase include but are not limited to exonuclease minus DNA Polymerase I
large (Klenow) Fragment, Phi29 DNA polymerase, Taq DNA Polymerase and the like. In addition, in
some embodiments, a polymerase that will replicate single-stranded DNA (i.e. without a primer
forming a double stranded section) can be used.

In a preferred embodiment, the RCA probe contains an adapter sequence as outlined herein, with
adapter capture probes on the array, for example on a microsphere when microsphere arrays are
being used. Alternatively, unique portions of the RCA probes, for example all or part of the sequence
corresponding to the target sequence, can be used to bind to a capture probe.

In a preferred embodiment, the padlock probe contains a restriction site. The restriction endonuclease
site allows for cleavage of the long concatamers that are typically the result of RCA into smaller
individual units that hybridize either more efficiently or faster to surface bound capture probes.
Thus, following RCA, the product nucleic acid is contacted with the appropriate restriction endonuclease.
This results in cleavage of the product nucleic acid into smaller fragments. The fragments are then
hybridized with the capture probe that is immobilized, resulting in a concentration of product fragments
onto the microsphere. Again, as outlined herein, these fragments can be detected in one of two ways:
either labelled nucleotides are incorporated during the replication step, or an additional label probe
is added.



Thus in a preferred embodiment, the padlock probe comprises a label sequence; i.e. a sequence that
can be used to bind label probes and is substantially complementary to a label probe. In one
embodiment, it is possible to use the same label sequence and label probe for all padlock probes on
an array; alternatively, each padlock probe can have a different label sequence.

The padlock probe also contains a priming site for priming the RCA reaction. That is, each padlock
probe comprises a sequence to which a primer nucleic acid hybridizes forming a template for the
polymerase. The primer can be found in any portion of the circular probe. In a preferred embodiment,
the primer is located at a discrete site in the probe. In this embodiment, the primer site in each distinct
padlock probe is identical, e.g. is a universal priming site, although this is not required. Advantages of
using primer sites with identical sequences include the ability to use only a single primer
oligonucleotide to prime the RCA assay with a plurality of different hybridization complexes. That
is, the padlock probe hybridizes uniquely to the target nucleic acid to which it is designed. A single primer
hybridizes to all of the unique hybridization complexes forming a priming site for the polymerase. RCA
then proceeds from an identical locus within each unique padlock probe of the hybridization
complexes.

In an alternate embodiment, the primer site can overlap, encompass, or reside within any of the
above-described elements of the padlock probe. That is, the primer can be found, for example,
overlapping or within the restriction site or the identifier sequence. In this embodiment, it is necessary
that the primer nucleic acid is designed to base pair with the chosen primer site.

Thus the padlock probe of the invention contains at each terminus, sequences corresponding to OLA
primers. The intervening sequence of the padlock probe contains, in no particular order, an adapter
sequence and a restriction endonuclease site. In addition, the padlock probe contains a RCA priming
site.

Thus in a preferred embodiment the OLA/RCA is performed in solution followed by restriction
endonuclease cleavage of the RCA product. The cleaved product is then applied to an array
comprising beads, each bead comprising a probe complementary to the adapter sequence located in
the padlock probe. The amplified adapter sequence correlates with a particular target nucleic acid.
Thus the incorporation of an endonuclease site allows the generation of short, easily hybridizable
sequences. Furthermore, the unique adapter sequence in each rolling circle padlock probe sequence
allows diverse sets of nucleic acid sequences to be analyzed in parallel on an array, since each
sequence is resolved on the basis of hybridization specificity.
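
The relationship between the circular probe, the concatameric RCA product and the cleaved fragments can be sketched with strings. This is an illustrative model only: the sequences are invented, the EcoRI recognition sequence (GAATTC, cut after the first G) is used merely as an example of a restriction site, cleavage is treated as a string split without modeling the double-stranded substrate the enzyme acts on, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def rolling_circle_product(circle: str, rounds: int) -> str:
    """Concatamer made by copying the circular template `rounds` times."""
    return reverse_complement(circle) * rounds


def digest(seq: str, site: str, cut_offset: int) -> list:
    """Split `seq` at every occurrence of `site`, cutting `cut_offset` bases in."""
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


# Hypothetical padlock probe elements, for illustration only.
ARM_5P = "ACGTTGCA"
ADAPTER = "TGCTAGGTCAGT"
SITE = "GAATTC"  # palindromic, so it also appears in the copied strand
ARM_3P = "CCTGATCG"
CIRCLE = ARM_5P + ADAPTER + SITE + ARM_3P

product = rolling_circle_product(CIRCLE, rounds=5)
fragments = digest(product, SITE, cut_offset=1)
unit_fragments = fragments[1:-1]  # the end pieces are partial units

if __name__ == "__main__":
    for frag in unit_fragments:
        print(len(frag), reverse_complement(ADAPTER) in frag)
```

Each full-length fragment is one probe-sized unit carrying the complement of the adapter, which in this model is the portion that would be recognized on the array.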

In addition, as will be appreciated by those in the art, other amplification reactions may be used; see 
WO 00/63437, hereby expressly incorporated by reference. 

Thus, the present invention provides for the generation of amplicons (sometimes referred to herein as
secondary targets).

In a preferred embodiment, the amplicons are labeled with a detection label. By "detection label" or
"detectable label" herein is meant a moiety that allows detection. This may be a primary label or a
secondary label. Accordingly, detection labels may be primary labels (i.e. directly detectable) or
secondary labels (indirectly detectable).

In a preferred embodiment, the detection label is a primary label. A primary label is one that can be
directly detected, such as a fluorophore. In general, labels fall into three classes: a) isotopic labels,
which may be radioactive or heavy isotopes; b) magnetic, electrical, thermal labels; and c) colored or
luminescent dyes. Labels can also include enzymes (horseradish peroxidase, etc.) and magnetic
particles. Preferred labels include chromophores or phosphors but are preferably fluorescent dyes.
Suitable dyes for use in the invention include, but are not limited to, fluorescent lanthanide complexes,
including those of Europium and Terbium, fluorescein, rhodamine, tetramethylrhodamine, eosin,
erythrosin, coumarin, methyl-coumarins, quantum dots (also referred to as "nanocrystals": see
U.S.S.N. 09/315,584, hereby incorporated by reference), pyrene, Malachite green, stilbene, Lucifer
Yellow, Cascade Blue™, Texas Red, Cy dyes (Cy3, Cy5, etc.), Alexa dyes, phycoerythrin, BODIPY, and
others described in the 6th Edition of the Molecular Probes Handbook by Richard P. Haugland, hereby
expressly incorporated by reference.

In a preferred embodiment, a secondary detectable label is used. A secondary label is one that is
indirectly detected; for example, a secondary label can bind or react with a primary label for detection,
can act on an additional product to generate a primary label (e.g. enzymes), or may allow the
separation of the compound comprising the secondary label from unlabeled materials, etc. Secondary
labels include, but are not limited to, one of a binding partner pair such as biotin/streptavidin;
chemically modifiable moieties; nuclease inhibitors, enzymes such as horseradish peroxidase, alkaline
phosphatases, luciferases, etc.

In a preferred embodiment, the secondary label is a binding partner pair. For example, the label may
be a hapten or antigen, which will bind its binding partner. For example, suitable binding partner pairs
include but are not limited to: antigens (such as proteins (including peptides)) and antibodies
(including fragments thereof (FAbs, etc.)); proteins and small molecules, including biotin/streptavidin;
enzymes and substrates or inhibitors; other protein-protein interacting pairs; receptor-ligands; and
carbohydrates and their binding partners. Nucleic acid - nucleic acid binding proteins pairs are also
useful. In general, the smaller of the pair is attached to the NTP for incorporation into the primer.
Preferred binding partner pairs include, but are not limited to, biotin (or imino-biotin) and streptavidin,
digoxigenin and Abs, and Prolinx™ reagents (see www.prolinxinc.com/ie4/home.hmtl).

In a preferred embodiment, the binding partner pair comprises biotin or imino-biotin and a fluorescently
labeled streptavidin. Imino-biotin is particularly preferred as imino-biotin disassociates from
streptavidin in pH 4.0 buffer while biotin requires harsh denaturants (e.g. 6 M guanidinium HCl, pH 1.5
or 90% formamide at 95°C).

In a preferred embodiment, the binding partner pair comprises a primary detection label (for example,
attached to the NTP and therefore to the amplicon) and an antibody that will specifically bind to the
primary detection label. By "specifically bind" herein is meant that the partners bind with specificity
sufficient to differentiate between the pair and other components or contaminants of the system. The
binding should be sufficient to remain bound under the conditions of the assay, including wash steps to
remove non-specific binding. In some embodiments, the dissociation constants of the pair will be less
than about 1 0^-1 0- M"^ . with less than about 1 0- to 1 0- M"^ being preferred and less than about 1 0 - 
10"^ M-^ being particularly preferred. 

In a preferred embodiment, the secondary label is a chemically modifiable moiety. In this
embodiment, labels comprising reactive functional groups are incorporated into the nucleic acid. The
functional group can then be subsequently labeled with a primary label. Suitable functional groups
include but are not limited to, amino groups, carboxy groups, maleimide groups, oxo groups and thiol
groups, with amino groups and thiol groups being particularly preferred. For example, primary labels
containing amino groups can be attached to secondary labels comprising amino groups, for example
using linkers as are known in the art; for example, homo- or hetero-bifunctional linkers as are well
known (see 1994 Pierce Chemical Company catalog, technical section on cross-linkers, pages
155-200, incorporated herein by reference).

As outlined herein, labeling can occur in a variety of ways, as will be appreciated by those in the art. In
general, labeling can occur in one of two ways: labels are incorporated into primers such that the
amplification reaction results in amplicons that comprise the labels, or labels are attached to dNTPs
and incorporated by the polymerase into the amplicons.

A preferred embodiment utilizes one primer comprising a biotin, that is used to bind a fluorescently
labeled streptavidin.

The present invention provides methods and compositions useful in the detection of nucleic acids,
particularly the labeled amplicons outlined herein. As is more fully outlined below, preferred systems
of the invention work as follows. Amplicons are attached (via hybridization) to an array site. This
attachment can be either directly to a capture probe on the surface, through the use of adapters, or
indirectly, using capture extender probes as outlined herein. In some embodiments, the target
sequence itself comprises the labels. Alternatively, a label probe is then added, forming an assay
complex. The attachment of the label probe may be direct (i.e. hybridization to a portion of the target
sequence), or indirect (i.e. hybridization to an amplifier probe that hybridizes to the target sequence),
with all the required nucleic acids forming an assay complex.

Accordingly, the present invention provides array compositions comprising at least a first substrate 
v^th a surface comprising individual sites. By "array" or "biochip" herein is meant a plurality of nucle.c 
acids in an array format; the size of the array will depend on the composition and end use of the array. 
Nucleic acids arrays are known in the art, and can be classified in a number of ways; both ordered 
arrays (e g. the ability to resolve chemistries at discrete sites), and random arrays are included. 
Ordered arrays include, but are not limited to, those made using photolithography techniques 
(Affymetrix GeneChip™), spotting techniques (Synteni and others), printing techniques (Hewlett 
Packard and Rosetta), three dimensional "gel pad" arrays, etc. A preferred embodiment utilizes 
microspheres on a variety of substrates including fiber optic bundles, as are outlined in PCTs
US98/21193, PCT US99/14387 and PCT US98/05025; WO98/50782; and U.S.S.N.s 09/287,573,
09/151,877, 09/256,943, 09/316,154, 60/119,323, 09/315,584; all of which are expressly incorporated
by reference. While much of the discussion below is directed to the use of microsphere arrays on fiber
optic bundles, any array format of nucleic acids on solid supports may be utilized.

Arrays containing from about 2 different bioactive agents (e.g. different beads, when beads are used)
to many millions can be made, with very large arrays being possible. Generally, the array will
comprise from two to as many as a billion or more, depending on the size of the beads and the
substrate, as well as the end use of the array; thus very high density, high density, moderate density,
low density and very low density arrays may be made. Preferred ranges for very high density arrays
are from about 10,000,000 to about 2,000,000,000, with from about 100,000,000 to about
1,000,000,000 being preferred (all numbers being per square cm). High density arrays range from about
100,000 to about 10,000,000, with from about 1,000,000 to about 5,000,000 being particularly
preferred. Moderate density arrays range from about 10,000 to about 100,000, which is particularly
preferred, with from about 20,000 to about 50,000 being especially preferred. Low density arrays are
generally less than 10,000, with from about 1,000 to about 5,000 being preferred. Very low density
arrays are less than 1,000, with from about 10 to about 1,000 being preferred, and from about 100 to
about 500 being particularly preferred. In some embodiments, the compositions of the invention may
not be in array format; that is, for some embodiments, compositions comprising a single bioactive
agent may be made as well. In addition, in some arrays, multiple substrates may be used, either of
different or identical compositions. Thus for example, large arrays may comprise a plurality of smaller
substrates.
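
By way of illustration only, the density ranges recited above can be expressed as a simple lookup. The following Python sketch is not part of the disclosed methods; the function name `density_class` and the handling of boundary values (each range is taken to include its lower bound) are assumptions made for the example.

```python
def density_class(sites_per_cm2: float) -> str:
    """Map a site density (sites per square cm) onto the tiers recited in the text.

    Boundary handling is an assumption: a value equal to a lower bound
    is placed in the higher tier.
    """
    if sites_per_cm2 < 1_000:
        return "very low"
    if sites_per_cm2 < 10_000:
        return "low"
    if sites_per_cm2 < 100_000:
        return "moderate"
    if sites_per_cm2 < 10_000_000:
        return "high"
    return "very high"


if __name__ == "__main__":
    for density in (500, 5_000, 20_000, 1_000_000, 100_000_000):
        print(density, density_class(density))
```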

In addition, one advantage of the present compositions is that, particularly through the use of fiber optic
technology, extremely high density arrays can be made. Thus for example, because beads of 200 μm
or less (with beads of 200 nm possible) can be used, and very small fibers are known, it is possible to
have as many as 40,000 or more (in some instances, 1 million) different elements (e.g. fibers and
beads) in a 1 mm² fiber optic bundle, with densities of greater than 25,000,000 individual beads and
fibers (again, in some instances as many as 50-100 million) per 0.5 cm² obtainable (4 million per
square cm for 5 μm center-to-center spacing and 100 million per square cm for 1 μm center-to-center spacing).
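
The per-square-centimeter figures in the preceding parenthetical follow from the center-to-center spacing if the sites are assumed to lie on a square grid. The short Python sketch below reproduces that arithmetic; the square-grid assumption and the function name `sites_per_cm2` are illustrative and are not stated in the text.

```python
def sites_per_cm2(pitch_um: float) -> float:
    """Sites per square cm for a square grid with the given pitch in micrometers."""
    um_per_cm = 10_000.0              # 1 cm = 10,000 micrometers
    sites_per_side = um_per_cm / pitch_um
    return sites_per_side ** 2


if __name__ == "__main__":
    print(sites_per_cm2(5.0))   # 4000000.0   (4 million per square cm)
    print(sites_per_cm2(1.0))   # 100000000.0 (100 million per square cm)
```

A hexagonal packing would give somewhat higher densities at the same spacing; the square grid is used here because it matches the recited figures exactly.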

By "substrate" or "solid support" or other grammatical equivalents herein is meant any material that 
can be modified to contain discrete individual sites appropriate for the attachment or association of 
beads and is amenable to at least one detection method. As will be appreciated by those in the art, 
the number of possible substrates is very large. Possible substrates include, but are not limited to,
glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of
styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.),
polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including silicon and
modified silicon, carbon, metals, inorganic glasses, plastics, optical fiber bundles, and a variety of
other polymers. In general, the substrates allow optical detection and do not themselves appreciably
fluoresce.

Generally the substrate is flat (planar), although as will be appreciated by those in the art, other
configurations of substrates may be used as well; for example, three dimensional configurations can
be used, for example by embedding the beads in a porous block of plastic that allows sample access
to the beads and using a confocal microscope for detection. Similarly, the beads may be placed on
the inside surface of a tube, for flow-through sample analysis to minimize sample volume. Preferred
substrates include optical fiber bundles as discussed below, and flat planar substrates such as glass,
polystyrene and other plastics and acrylics.

In a preferred embodiment, the substrate is an optical fiber bundle or array, as is generally described
in U.S.S.N.s 08/944,850 and 08/519,062, PCT US98/05025, and PCT US98/09163, all of which are
expressly incorporated herein by reference. Preferred embodiments utilize preformed unitary fiber
optic arrays. By "preformed unitary fiber optic array" herein is meant an array of discrete individual
fiber optic strands that are co-axially disposed and joined along their lengths. The fiber strands are
generally individually clad. However, one thing that distinguishes a preformed unitary array from other
fiber optic formats is that the fibers are not individually physically manipulatable; that is, one strand
generally cannot be physically separated at any point along its length from another fiber strand.

Generally, the array of array compositions of the invention can be configured in several ways; see for
example U.S.S.N. 09/473,904, hereby expressly incorporated by reference. In a preferred
embodiment, as is more fully outlined below, a "one component" system is used. That is, a first
substrate comprising a plurality of assay locations (sometimes also referred to herein as "assay
wells"), such as a microtiter plate, is configured such that each assay location contains an individual
array. That is, the assay location and the array location are the same. For example, the plastic
material of the microtiter plate can be formed to contain a plurality of "bead wells" in the bottom of
each of the assay wells. Beads containing the capture probes of the invention can then be loaded into
the bead wells in each assay location as is more fully described below.

Alternatively, a "two component" system can be used. In this embodiment, the individual arrays are
formed on a second substrate, which then can be fitted or "dipped" into the first microtiter plate
substrate. A preferred embodiment utilizes fiber optic bundles as the individual arrays, generally with
"bead wells" etched into one surface of each individual fiber, such that the beads containing the
capture probes are loaded onto the end of the fiber optic bundle. The composite array thus comprises
a number of individual arrays that are configured to fit within the wells of a microtiter plate.

By "composite array" or "combination array" or grammatical equivalents herein is meant a plurality of
individual arrays, as outlined above. Generally the number of individual arrays is set by the size of the
microtiter plate used; thus, 96 well, 384 well and 1536 well microtiter plates utilize composite arrays
comprising 96, 384 and 1536 individual arrays, although as will be appreciated by those in the art, not
each microtiter well need contain an individual array. It should be noted that the composite arrays can
comprise individual arrays that are identical, similar or different. That is, in some embodiments, it may
be desirable to do the same 2,000 assays on 96 different samples; alternatively, doing 192,000
experiments on the same sample (i.e. the same sample in each of the 96 wells) may be desirable.
Alternatively, each row or column of the composite array could be the same, for redundancy/quality
control. As will be appreciated by those in the art, there are a variety of ways to configure the system.
In addition, the random nature of the arrays may mean that the same population of beads may be
added to two different surfaces, resulting in substantially similar but perhaps not identical arrays.
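
The two configurations contrasted above (identical arrays run against different samples, versus different arrays run against one sample) can be tallied with a few lines of Python. This is a minimal sketch for illustration; the function name `composite_layout` and the mode labels `"identical_arrays"` and `"distinct_arrays"` are hypothetical and do not appear in the text.

```python
def composite_layout(wells: int, assays_per_array: int, mode: str) -> dict:
    """Tally samples and assays for a composite array in a microtiter plate.

    mode = "identical_arrays": every well holds the same array, one sample per well.
    mode = "distinct_arrays":  every well holds a different array, same sample in all wells.
    """
    if mode == "identical_arrays":
        samples = wells
        distinct_assays_per_sample = assays_per_array
    elif mode == "distinct_arrays":
        samples = 1
        distinct_assays_per_sample = wells * assays_per_array
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return {
        "samples": samples,
        "distinct_assays_per_sample": distinct_assays_per_sample,
        "total_measurements": wells * assays_per_array,
    }


if __name__ == "__main__":
    print(composite_layout(96, 2_000, "identical_arrays"))
    print(composite_layout(96, 2_000, "distinct_arrays"))
```

With 96 wells and 2,000 assays per individual array, the first mode gives 2,000 assays on each of 96 samples and the second gives 192,000 assays on a single sample, matching the figures recited above.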

At least one surface of the substrate is modified to contain discrete, individual sites for later 
association of microspheres. These sites may comprise physically altered sites, i.e. physical 
configurations such as wells or small depressions in the substrate that can retain the beads, such that 
a microsphere can rest in the well, or the use of other forces (magnetic or compressive), or chemically 
altered or active sites, such as chemically functionalized sites, electrostatically altered sites, 
hydrophobically/hydrophilically functionalized sites, spots of adhesive, etc.

The sites may be a pattern, i.e. a regular design or configuration, or randomly distributed. A preferred 
embodiment utilizes a regular pattern of sites such that the sites may be addressed in the X-Y 
coordinate plane. "Pattern" in this sense includes a repeating unit cell, preferably one that allows a
high density of beads on the substrate. However, it should be noted that these sites may not be
discrete sites. That is, it is possible to use a uniform surface of adhesive or chemical functionalities,
for example, that allows the attachment of beads at any position. That is, the surface of the substrate 
is modified to allow attachment of the microspheres at individual sites, whether or not those sites are 
contiguous or non-contiguous with other sites. Thus, the surface of the substrate may be modified 
such that discrete sites are formed that can only have a single associated bead, or alternatively, the 
surface of the substrate is modified and beads may go down anywhere, but they end up at discrete 
sites. 

In a preferred embodiment, the surface of the substrate is modified to contain wells, i.e. depressions in
the surface of the substrate. This may be done as is generally known in the art using a variety of
techniques, including, but not limited to, photolithography, stamping techniques, molding techniques
and microetching techniques. As will be appreciated by those in the art, the technique used will
depend on the composition and shape of the substrate.

In a preferred embodiment, physical alterations are made in a surface of the substrate to produce the
sites. In a preferred embodiment, the substrate is a fiber optic bundle and the surface of the substrate
is a terminal end of the fiber bundle, as is generally described in 08/818,199 and 09/151,877, both of
which are hereby expressly incorporated by reference. In this embodiment, wells are made in a
terminal or distal end of a fiber optic bundle comprising individual fibers. In this embodiment, the
cores of the individual fibers are etched, with respect to the cladding, such that small wells or
depressions are formed at one end of the fibers. The required depth of the wells will depend on the
size of the beads to be added to the wells.

Generally in this embodiment, the microspheres are non-covalently associated in the wells, although 
the wells may additionally be chemically functionalized as is generally described below, cross-linking 
agents may be used, or a physical barrier may be used, i.e. a film or membrane over the beads. 

In a preferred embodiment, the surface of the substrate is modified to contain chemically modified 
sites, that can be used to attach, either covalently or non-covalently, the microspheres of the invention 
to the discrete sites or locations on the substrate. "Chemically modified sites" in this context includes, 
but is not limited to, the addition of a pattern of chemical functional groups including amino groups, 
carboxy groups, oxo groups and thiol groups, that can be used to covalently attach microspheres, 
which generally also contain corresponding reactive functional groups; the addition of a pattern of 
adhesive that can be used to bind the microspheres (either by prior chemical functionalization for the 
addition of the adhesive or direct addition of the adhesive); the addition of a pattern of charged groups 
(similar to the chemical functionalities) for the electrostatic attachment of the microspheres, i.e. when 
the microspheres comprise charged groups opposite to the sites; the addition of a pattern of chemical 
functional groups that renders the sites differentially hydrophobic or hydrophilic, such that the addition 
of similarly hydrophobic or hydrophilic microspheres under suitable experimental conditions will result 
in association of the microspheres to the sites on the basis of hydroaffinity. For example, the use of
hydrophobic sites with hydrophobic beads, in an aqueous system, drives the association of the beads
preferentially onto the sites. As outlined above, "pattern" in this sense includes the use of a uniform
treatment of the surface to allow attachment of the beads at discrete sites, as well as treatment of the 
surface resulting in discrete sites. As will be appreciated by those in the art, this may be accomplished 
in a variety of ways. 

In some embodiments, the beads are not associated with a substrate. That is, the beads are in
solution or are not distributed on a patterned substrate.

In a preferred embodiment, the compositions of the invention further comprise a population of
microspheres. By "population" herein is meant a plurality of beads as outlined above for arrays.
Within the population are separate subpopulations, which can be a single microsphere or multiple
identical microspheres. That is, in some embodiments, as is more fully outlined below, the array may
contain only a single bead for each capture probe; preferred embodiments utilize a plurality of beads of
each type.

By "microspheres" or "beads" or "particles" or grammatical equivalents herein is meant small discrete 
particles. The composition of the beads will vary, depending on the class of capture probe and the 
method of synthesis. Suitable bead compositions include those used in peptide, nucleic acid and
organic moiety synthesis, including, but not limited to, plastics, ceramics, glass, polystyrene,
methylstyrene, acrylic polymers, paramagnetic materials, thoria sol, carbon graphite, titanium dioxide,
latex or cross-linked dextrans such as Sepharose, cellulose, nylon, cross-linked micelles and Teflon.
"Microsphere Detection Guide" from Bangs Laboratories, Fishers IN is a helpful guide.

The beads need not be spherical; irregular particles may be used. In addition, the beads may be 
porous, thus increasing the surface area of the bead available for either capture probe attachment or
tag attachment. The bead sizes range from nanometers, i.e. 100 nm, to millimeters, i.e. 1 mm, with 
beads from about 0.2 micron to about 200 microns being preferred, and from about 0.5 to about 5 
micron being particularly preferred, although in some embodiments smaller beads may be used. 

It should be noted that a key component of the invention is the use of a substrate/bead pairing that 
allows the association or attachment of the beads at discrete sites on the surface of the substrate, 
such that the beads do not move during the course of the assay. 

Each microsphere comprises a capture probe, although as will be appreciated by those in the art, 
there may be some microspheres which do not contain a capture probe, depending on the synthetic 
methods. 

Attachment of the nucleic acids may be done in a variety of ways, as will be appreciated by those in
the art, including, but not limited to, chemical or affinity capture (for example, including the
incorporation of derivatized nucleotides such as AminoLink or biotinylated nucleotides that can then be
used to attach the nucleic acid to a surface, as well as affinity capture by hybridization), cross-linking,
and electrostatic attachment, etc. In a preferred embodiment, affinity capture is used to attach the
nucleic acids to the beads. For example, nucleic acids can be derivatized, for example with one
member of a binding pair, and the beads derivatized with the other member of a binding pair. Suitable
binding pairs are as described herein for IBL/DBL pairs. For example, the nucleic acids may be
biotinylated (for example using enzymatic incorporation of biotinylated nucleotides, or by
photoactivated cross-linking of biotin). Biotinylated nucleic acids can then be captured on
streptavidin-coated beads, as is known in the art. Similarly, other hapten-receptor combinations can be
used, such as digoxigenin and anti-digoxigenin antibodies. Alternatively, chemical groups can be added
in the form of derivatized nucleotides, that can then be used to add the nucleic acid to the surface.

Preferred attachments are covalent, although even relatively weak interactions (i.e. non-covalent) can 
be sufficient to attach a nucleic acid to a surface, if there are multiple sites of attachment per each 
nucleic acid. Thus, for example, electrostatic interactions can be used for attachment, for example by 
having beads carrying the opposite charge to the bioactive agent. 

Similarly, affinity capture utilizing hybridization can be used to attach nucleic acids to beads. 

Alternatively, chemical crosslinking may be done, for example by photoactivated crosslinking of 
thymidine to reactive groups, as is known in the art. 

In a preferred embodiment, each bead comprises a single type of capture probe, although a plurality of 
individual capture probes are preferably attached to each bead. Similarly, preferred embodiments 
utilize more than one microsphere containing a unique capture probe; that is, there is redundancy built 
into the system by the use of subpopulations of microspheres, each microsphere in the subpopulation 
containing the same capture probe. 

As will be appreciated by those in the art, the capture probes may either be synthesized directly on the 
beads, or they may be made and then attached after synthesis. In a preferred embodiment, linkers 
are used to attach the capture probes to the beads, to allow both good attachment, sufficient flexibility 
to allow good interaction with the target molecule, and to avoid undesirable binding reactions. 

In a preferred embodiment, the capture probes are synthesized directly on the beads. As is known in 
the art, many classes of chemical compounds are currently synthesized on solid supports, such as 
peptides, organic moieties, and nucleic acids. It is a relatively straightforward matter to adjust the
current synthetic techniques to use beads. 

In a preferred embodiment, the capture probes are synthesized first, and then covalently attached to
the beads. As will be appreciated by those in the art, this will be done depending on the composition
of the capture probes and beads. The functionalization of solid support surfaces such as certain
polymers with chemically reactive groups such as thiols, amines, carboxyls, etc. is generally known in
the art. Accordingly, "blank" microspheres may be used that have surface chemistries that facilitate the
attachment of the desired functionality by the user. Some examples of these surface chemistries for
blank microspheres include, but are not limited to, amino groups including aliphatic and aromatic
amines, carboxylic acids, aldehydes, amides, chloromethyl groups, hydrazide, hydroxyl groups,
sulfonates and sulfates.

When random arrays are used, an encoding/decoding system must be used. For example, when
microsphere arrays are used, the beads are generally put onto the substrate randomly; as such there
are several ways to correlate the functionality on the bead with its location, including the incorporation
of unique optical signatures, generally fluorescent dyes, that could be used to identify the nucleic acid
product from one of the methods described herein (e.g. an extended SBE probe, a ligated probe, a
cleaved signal probe, etc.).

Once the identity (i.e. the actual agent) and location of each microsphere in the array has been fixed,
the array is exposed to samples containing the target sequences, although as outlined below, this can
be done prior to or during the analysis as well. The target sequences can hybridize (either directly or
indirectly) to the capture probes as is more fully outlined below, which results in a change in the optical
signal of a particular bead.

In the present invention, "decoding" does not rely on the use of optical signatures, but rather on the
use of decoding binding ligands that are added during a decoding step. The decoding binding ligands
will bind either to a distinct identifier binding ligand partner that is placed on the beads, or to the
capture probe itself. The decoding binding ligands are either directly or indirectly labeled, and thus
decoding occurs by detecting the presence of the label. By using pools of decoding binding ligands in
a sequential fashion, it is possible to greatly minimize the number of required decoding steps.
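
The text at this point does not specify how the pools are composed. One scheme that is consistent with sequential pooled decoding, offered here purely as an assumption for illustration, is a positional code: each bead type is assigned a sequence of labels, one per decoding stage, and in each stage the decoder for that bead type carries the label assigned to it for that stage. With k distinguishable labels and s stages, up to k^s bead types can be told apart. The Python sketch below shows the bookkeeping; all names (`min_stages`, `assign_codes`, `stage_pools`, `decode`) and the label names are hypothetical.

```python
from itertools import product


def min_stages(n_probes: int, n_labels: int) -> int:
    """Smallest number of stages s such that n_labels ** s >= n_probes."""
    if n_labels < 2:
        raise ValueError("at least two distinguishable labels are needed")
    stages, capacity = 1, n_labels
    while capacity < n_probes:
        stages += 1
        capacity *= n_labels
    return stages


def assign_codes(probe_ids, labels, stages):
    """Give every probe a distinct tuple of labels, one label per stage."""
    codes = list(product(labels, repeat=stages))
    if len(probe_ids) > len(codes):
        raise ValueError("not enough label/stage combinations for these probes")
    return dict(zip(probe_ids, codes))


def stage_pools(codebook, stages):
    """For each stage, map probe id -> label carried by its decoder in that pool."""
    return [{pid: code[s] for pid, code in codebook.items()} for s in range(stages)]


def decode(observed_labels, codebook):
    """Identify a bead from the labels observed at its site across all stages."""
    lookup = {code: pid for pid, code in codebook.items()}
    return lookup.get(tuple(observed_labels))


if __name__ == "__main__":
    probes = [f"probe{i}" for i in range(16)]
    labels = ("red", "green", "blue", "yellow")
    stages = min_stages(len(probes), len(labels))   # 2 stages suffice for 16 probes
    book = assign_codes(probes, labels, stages)
    print(stages, decode(book["probe7"], book))
```

Under this assumed scheme, 16 bead types need two stages with four labels, rather than 16 separate decoding steps with one decoder each.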

In some embodiments, the microspheres may additionally comprise identifier binding ligands for use in 
certain decoding systems. By "identifier binding ligands" or "IBLs" herein is meant a compound that 
will specifically bind a corresponding decoder binding ligand (DBL) to facilitate the elucidation of the 
identity of the capture probe attached to the bead. That is, the IBL and the corresponding DBL form a
binding partner pair. By "specifically bind" herein is meant that the IBL binds its DBL with specificity
sufficient to differentiate between the corresponding DBL and other DBLs (that is, DBLs for other
IBLs) or other components or contaminants of the system. The binding should be sufficient to remain
bound under the conditions of the decoding step, including wash steps to remove non-specific binding.
In some embodiments, for example when the IBLs and corresponding DBLs are proteins or nucleic
acids the dissociation constants of the IBL to its DBL will be less than about 10-^-10- M-\ with less 
than about 1 0- to 1 0- M"^ being preferred and less than about 1 0- -1 0- M"^ being particularly 
preferred. 

IBL-DBL binding pairs are known or can be readily found using known techniques. For example, when 
the IBL is a protein, the DBLs include proteins (particularly including antibodies or fragments thereof 
(FAbs, etc.)) or small molecules, or vice versa (the IBL is an antibody and the DBL is a protein). Metal
ion-metal ion ligand or chelator pairs are also useful. Antigen-antibody pairs, enzymes and
substrates or inhibitors, other protein-protein interacting pairs, receptor-ligands, complementary
nucleic acids, and carbohydrates and their binding partners are also suitable binding pairs. Nucleic
acid-nucleic acid binding protein pairs are also useful. Similarly, as is generally described in U.S.
Patents 5,270,163, 5,475,096, 5,567,588, 5,595,877, 5,637,459, 5,683,867, 5,705,337, and related
patents, hereby incorporated by reference, nucleic acid "aptamers" can be developed for binding to
virtually any target; such an aptamer-target pair can be used as the IBL-DBL pair. Similarly, there is a
wide body of literature relating to the development of binding pairs based on combinatorial chemistry 
methods. 

In a preferred embodiment, the IBL is a molecule whose color or luminescence properties change in 
the presence of a selectively-binding DBL. For example, the IBL may be a fluorescent pH indicator
whose emission intensity changes with pH. Similarly, the IBL may be a fluorescent ion indicator, 
whose emission properties change with ion concentration. 

Alternatively, the IBL is a molecule whose color or luminescence properties change in the presence of 
various solvents. For example, the IBL may be a fluorescent molecule such as an ethidium salt whose 
fluorescence intensity increases in hydrophobic environments. Similarly, the IBL may be a derivative 
of fluorescein whose color changes between aqueous and nonpolar solvents. 

In one embodiment, the DBL may be attached to a bead, i.e. a "decoder bead", that may carry a label 
such as a fluorophore. 

In a preferred embodiment, the IBL-DBL pair comprise substantially complementary single-stranded 
nucleic acids. In this embodiment, the binding ligands can be referred to as "identifier probes" and 
"decoder probes". Generally, the identifier and decoder probes range from about 4 basepairs in length 
to about 1000, with from about 6 to about 100 being preferred, and from about 8 to about 40 being
particularly preferred. What is important is that the probes are long enough to be specific, i.e. to 
distinguish between different IBL-DBL pairs, yet short enough to allow both a) dissociation, if 
necessary, under suitable experimental conditions, and b) efficient hybridization. 
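
As an illustration of the identifier probe/decoder probe relationship, the following Python sketch derives a decoder probe as the reverse complement of an identifier probe and checks the probe length against the particularly preferred range of about 8 to about 40 bases. It is a simplification: the text requires only substantial complementarity, whereas the sketch uses exact complementarity, and the function names and the example sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def decoder_for(identifier: str, min_len: int = 8, max_len: int = 40) -> str:
    """Design a perfectly complementary decoder probe for an identifier probe.

    The default length limits follow the 'particularly preferred' range in the
    text; exact complementarity is an assumption made for this sketch.
    """
    if not min_len <= len(identifier) <= max_len:
        raise ValueError("identifier probe length is outside the allowed range")
    if set(identifier.upper()) - set("ACGT"):
        raise ValueError("identifier probe contains non-ACGT characters")
    return reverse_complement(identifier)


if __name__ == "__main__":
    print(decoder_for("ACGTACGTAC"))  # GTACGTACGT
```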

In a preferred embodiment, as is more fully outlined below, the IBLs do not bind to DBLs. Rather, the 
IBLs are used as identifier moieties ("IMs") that are identified directly, for example through the use of 
mass spectroscopy. 

Alternatively, in a preferred embodiment, the IBL and the capture probe are the same moiety; thus, for 
example, as outlined herein, particularly when no optical signatures are used, the capture probe can 
serve as both the identifier and the agent. For example, in the case of nucleic acids, the bead-bound 
probe (which serves as the capture probe) can also bind decoder probes, to identify the sequence of 
the probe on the bead. Thus, in this embodiment, the DBLs bind to the capture probes.

In a preferred embodiment, the microspheres may contain an optical signature. That is, as outlined in
U.S.S.N.s 08/818,199 and 09/151,877, previous work had each subpopulation of microspheres
comprising a unique optical signature or optical tag that is used to identify the unique capture probe of
that subpopulation of microspheres; that is, decoding utilizes optical properties of the beads such that
a bead comprising the unique optical signature may be distinguished from beads at other locations
with different optical signatures. Thus the previous work assigned each capture probe a unique optical
signature such that any microspheres comprising that capture probe are identifiable on the basis of the
signature. These optical signatures comprised dyes, usually chromophores or fluorophores, that were
entrapped or attached to the beads themselves. Diversity of optical signatures utilized different
fluorochromes, different ratios of mixtures of fluorochromes, and different concentrations (intensities)
of fluorochromes.



In a preferred embodiment, the present invention does not rely solely on the use of optical properties
to decode the arrays. However, as will be appreciated by those in the art, it is possible in some
embodiments to utilize optical signatures as an additional coding method, in conjunction with the
present system. Thus, for example, as is more fully outlined below, the size of the array may be
effectively increased while using a single set of decoding moieties in several ways, one of which is the
use of optical signatures on some beads. Thus, for example, using one "set" of decoding moieties,
the use of two populations of beads, one with an optical signature and one without, allows the effective
doubling of the array size. The use of multiple optical signatures similarly increases the possible size
of the array.



In a preferred embodiment, each subpopulation of beads comprises a plurality of different IBLs. By
using a plurality of different IBLs to encode each capture probe, the number of possible unique codes
is substantially increased. That is, by using one unique IBL per capture probe, the size of the array will
be the number of unique IBLs (assuming no "reuse" occurs, as outlined below). However, by using a
plurality of different IBLs per bead, n, the size of the array can be increased to 2^n, when the presence
or absence of each IBL is used as the indicator. For example, the assignment of 10 IBLs per bead
generates a 10 bit binary code, where each bit can be designated as "1" (IBL is present) or "0" (IBL is
absent). A 10 bit binary code has 2^10 possible variants. However, as is more fully discussed below, the
size of the array may be further increased if another parameter is included such as concentration or
intensity; thus for example, if two different concentrations of the IBL are used, then the array size
increases as 3^n. Thus, in this embodiment, each individual capture probe in the array is assigned a
combination of IBLs, which can be added to the beads prior to the addition of the capture probe, after,
or during the synthesis of the capture probe, i.e. simultaneous addition of IBLs and capture probe
components.

Alternatively, the combination of different IBLs can be used to elucidate the sequence of the nucleic
acid. Thus, for example, using two different IBLs (IBL1 and IBL2), the first position of a nucleic acid
can be elucidated: for example, adenosine can be represented by the presence of both IBL1 and IBL2;
thymidine can be represented by the presence of IBL1 but not IBL2, cytosine can be represented by
the presence of IBL2 but not IBL1, and guanosine can be represented by the absence of both. The
second position of the nucleic acid can be done in a similar manner using IBL3 and IBL4; thus, the
presence of IBL1, IBL2, IBL3 and IBL4 gives a sequence of AA; IBL1, IBL2, and IBL3 shows the
sequence AT; IBL1, IBL3 and IBL4 gives the sequence TA, etc. The third position utilizes IBL5 and
IBL6, etc. In this way, the use of 20 different identifiers can yield a unique code for every possible
10-mer.



In this way, a sort of "bar code" for each sequence can be constructed; the presence or absence of
each distinct IBL will allow the identification of each capture probe. 
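
The two-IBLs-per-position scheme described above can be written out as a small encoder and decoder. The Python sketch below follows the mapping given in the text (A: both IBLs present; T: first only; C: second only; G: neither), with position i of the sequence using the IBLs numbered 2i-1 and 2i. The function names and the representation of a bead's IBLs as a set of integers are assumptions made for the example.

```python
# Presence flags for (first IBL, second IBL) of each position, as given in the text.
BASE_TO_FLAGS = {
    "A": (True, True),
    "T": (True, False),
    "C": (False, True),
    "G": (False, False),
}
FLAGS_TO_BASE = {flags: base for base, flags in BASE_TO_FLAGS.items()}


def encode(seq: str) -> frozenset:
    """Return the set of IBL numbers present on a bead carrying `seq`."""
    ibls = set()
    for pos, base in enumerate(seq.upper()):
        first, second = BASE_TO_FLAGS[base]
        if first:
            ibls.add(2 * pos + 1)
        if second:
            ibls.add(2 * pos + 2)
    return frozenset(ibls)


def decode(ibls, length: int) -> str:
    """Recover a sequence of known length from the set of IBLs detected."""
    return "".join(
        FLAGS_TO_BASE[((2 * pos + 1) in ibls, (2 * pos + 2) in ibls)]
        for pos in range(length)
    )


if __name__ == "__main__":
    print(sorted(encode("AA")))   # [1, 2, 3, 4]
    print(sorted(encode("AT")))   # [1, 2, 3]
    print(sorted(encode("TA")))   # [1, 3, 4]
    print(decode({1, 3, 4}, 2))   # TA
```

Because guanosine is represented by the absence of both IBLs, the probe length must be known in order to decode; a 10-mer uses IBL1 through IBL20, which is the figure of 20 identifiers recited above.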

In addition, the use of different concentrations or densities of IBLs allows a "reuse" of sorts. If, for
example, the bead comprising a first agent has a 1X concentration of IBL, and a second bead
comprising a second agent has a 10X concentration of IBL, using saturating concentrations of the
corresponding labelled DBL allows the user to distinguish between the two beads.
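
The effect of presence/absence coding and of concentration "reuse" on the number of distinguishable codes can be checked with a one-line calculation. In the Python sketch below, each of n IBLs is assumed to be either absent or present at one of a fixed number of distinguishable concentration levels; the function name `code_capacity` and its parameters are hypothetical.

```python
def code_capacity(n_ibls: int, levels: int = 1) -> int:
    """Number of distinct codes when each IBL is absent or present at one of
    `levels` distinguishable concentrations.

    levels=1 gives the presence/absence (binary) case; levels=2 corresponds
    to two concentrations, e.g. 1X and 10X.
    """
    if n_ibls < 0 or levels < 1:
        raise ValueError("n_ibls must be >= 0 and levels must be >= 1")
    return (levels + 1) ** n_ibls


if __name__ == "__main__":
    print(code_capacity(10))      # 1024  (2 to the power 10)
    print(code_capacity(10, 2))   # 59049 (3 to the power 10)
```

The count includes the code in which every IBL is absent, consistent with the binary-code description in which each bit may be "0".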

Once the microspheres comprising the capture probes are generated, they are added to the substrate
to form an array. It should be noted that while most of the methods described herein add the beads to
the substrate prior to the assay, the order of making, using and decoding the array can vary. For
example, the array can be made, decoded, and then the assay done. Alternatively, the array can be
made, used in an assay, and then decoded; this may find particular use when only a few beads need
be decoded. Alternatively, the beads can be added to the assay mixture, i.e. the sample containing
the target sequences, prior to the addition of the beads to the substrate; after addition and assay, the
array may be decoded. This is particularly preferred when the sample comprising the beads is
agitated or mixed; this can increase the amount of target sequence bound to the beads per unit time,
and thus (in the case of nucleic acid assays) increase the hybridization kinetics. This may find
particular use in cases where the concentration of target sequence in the sample is low; generally, for
low concentrations, long binding times must be used.

In general, the methods of making the arrays and of decoding the arrays are done to maximize the
number of different candidate agents that can be uniquely encoded. The compositions of the invention
may be made in a variety of ways. In general, the arrays are made by adding a solution or slurry
comprising the beads to a surface containing the sites for attachment of the beads. This may be done
in a variety of buffers, including aqueous and organic solvents, and mixtures. The solvent can
evaporate, and excess beads are removed.

In a preferred embodiment, when non-covalent methods are used to associate the beads with the
array, a novel method of loading the beads onto the array is used. This method comprises exposing
the array to a solution of particles (including microspheres and cells) and then applying energy, e.g.
agitating or vibrating the mixture. This results in an array comprising more tightly associated particles,
as the agitation is done with sufficient energy to cause weakly-associated beads to fall off (or out, in
the case of wells). These sites are then available to bind a different bead. In this way, beads that
exhibit a high affinity for the sites are selected. Arrays made in this way have two main advantages as
compared to a more static loading: first of all, a higher percentage of the sites can be filled easily, and
secondly, the arrays thus loaded show a substantial decrease in bead loss during assays. Thus, in a
preferred embodiment, these methods are used to generate arrays that have at least about 50% of the
sites filled, with at least about 75% being preferred, and at least about 90% being particularly
preferred. Similarly, arrays generated in this manner preferably lose less than about 20% of the beads
during an assay, with less than about 10% being preferred and less than about 5% being particularly
preferred.

In this embodiment, the substrate comprising the surface with the discrete sites is immersed into a
solution comprising the particles (beads, cells, etc.). The surface may comprise wells, as is described
herein, or other types of sites on a patterned surface such that there is a differential affinity for the
sites. This differential affinity results in a competitive process, such that particles that will associate
more tightly are selected. Preferably, the entire surface to be "loaded" with beads is in fluid contact
with the solution. This solution is generally a slurry ranging from about 10,000:1 beads:solution
(vol:vol) to 1:1. Generally, the solution can comprise any number of reagents, including aqueous
buffers, organic solvents, salts, other reagent components, etc. In addition, the solution preferably
comprises an excess of beads; that is, there are more beads than sites on the array. Preferred
embodiments utilize two-fold to billion-fold excess of beads.

The immersion can mimic the assay conditions; for example, if the array is to be "dipped" from above 
into a microtiter plate comprising samples, this configuration can be repeated for the loading, thus 
minimizing the beads that are likely to fall out due to gravity. 

Once the surface has been immersed, the substrate, the solution, or both are subjected to a 
competitive process, whereby the particles with lower affinity can be disassociated from the substrate 
and replaced by particles exhibiting a higher affinity to the site. This competitive process is done by 
the introduction of energy, in the form of heat, sonication, stirring or mixing, vibrating or agitating the
solution or substrate, or both.

A preferred embodiment utilizes agitation or vibration. In general, the amount of manipulation of the
substrate is minimized to prevent damage to the array; thus, preferred embodiments utilize the
agitation of the solution rather than the array, although either will work. As will be appreciated by those
in the art, this agitation can take on any number of forms, with a preferred embodiment utilizing
microtiter plates comprising bead solutions being agitated using microtiter plate shakers.

The agitation proceeds for a period of time sufficient to load the array to a desired fill. Depending on 
the size and concentration of the beads and the size of the array, this time may range from about 1 
second to days, with from about 1 minute to about 24 hours being preferred. 

It should be noted that not all sites of an array may comprise a bead; that is, there may be some sites 
on the substrate surface which are empty. In addition, there may be some sites that contain more 
than one bead, although this is not preferred. 

In some embodiments, for example when chemical attachment is done, it is possible to attach the
beads in a non-random or ordered way. For example, using photoactivatible attachment linkers or
photoactivatible adhesives or masks, selected sites on the array may be sequentially rendered suitable
for attachment, such that defined populations of beads are laid down.

The arrays of the present invention are constructed such that information about the identity of the 
capture probe is built into the array, such that the random deposition of the beads in the fiber wells can 
be "decoded" to allow identification of the capture probe at all positions. This may be done in a variety 
of ways, and either before, during or after the use of the array to detect target molecules. 

Thus, after the array is made, it is "decoded" in order to identify the location of one or more of the
capture probes, i.e. each subpopulation of beads, on the substrate surface.

In a preferred embodiment, pyrosequencing techniques are used to decode the array, as is generally
described in "Nucleic Acid Sequencing Using Microsphere Arrays", filed October 22, 1999 (no
U.S.S.N. received yet), hereby expressly incorporated by reference.

In a preferred embodiment, a selective decoding system is used. In this case, only those
microspheres exhibiting a change in the optical signal as a result of the binding of a target sequence
are decoded. This is commonly done when the number of "hits", i.e. the number of sites to decode, is
generally low. That is, the array is first scanned under experimental conditions in the absence of the
target sequences. The sample containing the target sequences is added, and only those locations
exhibiting a change in the optical signal are decoded. For example, the beads at either the positive or
negative signal locations may be either selectively tagged or released from the array (for example
through the use of photocleavable linkers), and subsequently sorted or enriched in a fluorescence-
activated cell sorter (FACS). That is, either all the negative beads are released, and then the positive
beads are either released or analyzed in situ, or alternatively all the positives are released and
analyzed. Alternatively, the labels may comprise halogenated aromatic compounds, and detection of
the label is done using, for example, gas chromatography, chemical tags, isotopic tags or mass spectral
tags.



As will be appreciated by those in the art, this may also be done in systems where the array is not
decoded; i.e. there need not ever be a correlation of bead composition with location. In this
embodiment, the beads are loaded on the array, and the assay is run. The "positives", i.e. those
beads displaying a change in the optical signal as is more fully outlined below, are then marked to
distinguish or separate them from the "negative" beads. This can be done in several ways, preferably
using fiber optic arrays. In a preferred embodiment, each bead contains a fluorescent dye. After the
assay and the identification of the "positives" or "active beads", light is shone down either only the
positive fibers or only the negative fibers, generally in the presence of a light-activated reagent
(typically dissolved oxygen). In the former case, all the active beads are photobleached. Thus, upon
non-selective release of all the beads with subsequent sorting, for example using a fluorescence
activated cell sorter (FACS) machine, the non-fluorescent active beads can be sorted from the
fluorescent negative beads. Alternatively, when light is shone down the negative fibers, all the
negatives are non-fluorescent and the positives are fluorescent, and sorting can proceed. The
characterization of the attached capture probe may be done directly, for example using mass
spectroscopy.

Alternatively, the identification may occur through the use of identifier moieties ("IMs"), which are
similar to IBLs but need not necessarily bind to DBLs. That is, rather than elucidate the structure of
the capture probe directly, the composition of the IMs may serve as the identifier. Thus, for example,
a specific combination of IMs can serve to code the bead, and be used to identify the agent on the
bead upon release from the bead followed by subsequent analysis, for example using a gas
chromatograph or mass spectroscope.

Alternatively, rather than having each bead contain a fluorescent dye, each bead comprises a non-
fluorescent precursor to a fluorescent dye. For example, using photocleavable protecting groups,
such as certain ortho-nitrobenzyl groups, on a fluorescent molecule, photoactivation of the
fluorochrome can be done. After the assay, light is shone down again either the "positive" or the
"negative" fibers, to distinguish these populations. The illuminated precursors are then chemically
converted to a fluorescent dye. All the beads are then released from the array, with sorting, to form
populations of fluorescent and non-fluorescent beads (either the positives and the negatives or vice
versa).

In an alternate preferred embodiment, the sites of attachment of the beads (for example the wells)
include a photopolymerizable reagent, or the photopolymerizable agent is added to the assembled
array. After the test assay is run, light is shone down again either the "positive" or the "negative"
fibers, to distinguish these populations. As a result of the irradiation, either all the positives or all the
negatives are polymerized and trapped or bound to the sites, while the other population of beads can
be released from the array.

In a preferred embodiment, the location of every capture probe is determined using decoder binding
ligands (DBLs). As outlined above, DBLs are binding ligands that will either bind to identifier binding
ligands, if present, or to the capture probes themselves, preferably when the capture probe is a nucleic
acid or protein.

In a preferred embodiment, as outlined above, the DBL binds to the IBL.

In a preferred embodiment, the capture probes are single-stranded nucleic acids and the DBL is a
substantially complementary single-stranded nucleic acid that binds (hybridizes) to the capture probe,
termed a decoder probe herein. A decoder probe that is substantially complementary to each
candidate probe is made and used to decode the array. In this embodiment, the candidate probes and
the decoder probes should be of sufficient length (and the decoding step run under suitable
conditions) to allow specificity; i.e. each candidate probe binds to its corresponding decoder probe with
sufficient specificity to allow the distinction of each candidate probe.

In a preferred embodiment, the DBLs are either directly or indirectly labeled. In a preferred
embodiment, the DBL is directly labeled; that is, the DBL comprises a label. In an alternate
embodiment, the DBL is indirectly labeled; that is, a labeling binding ligand (LBL) that will bind to the
DBL is used. In this embodiment, the labeling binding ligand-DBL pair can be as described above for
IBL-DBL pairs.

Accordingly, the identification of the location of the individual beads (or subpopulations of beads) is
done using one or more decoding steps comprising a binding between the labeled DBL and either the
IBL or the capture probe (i.e. a hybridization between the candidate probe and the decoder probe
when the capture probe is a nucleic acid). After decoding, the DBLs can be removed and the array
can be used; however, in some circumstances, for example when the DBL binds to an IBL and not to
the capture probe, the removal of the DBL is not required (although it may be desirable in some
circumstances). In addition, as outlined herein, decoding may be done either before the array is used
in an assay, during the assay, or after the assay.

In one embodiment, a single decoding step is done. In this embodiment, each DBL is labeled with a
unique label, such that the number of unique tags is equal to or greater than the number of capture
probes (although in some cases, "reuse" of the unique labels can be done, as described herein;
similarly, minor variants of candidate probes can share the same decoder, if the variants are encoded
in another dimension, i.e. in the bead size or label). For each capture probe or IBL, a DBL is made
that will specifically bind to it and contains a unique tag, for example one or more fluorochromes.
Thus the identity of each DBL, both its composition (i.e. its sequence when it is a nucleic acid) and its
label, is known. Then, by adding the DBLs to the array containing the capture probes under conditions
which allow the formation of complexes (termed hybridization complexes when the components are
nucleic acids) between the DBLs and either the capture probes or the IBLs, the location of each DBL
can be elucidated. This allows the identification of the location of each capture probe; the random
array has been decoded. The DBLs can then be removed, if necessary, and the target sample
applied.

In a preferred embodiment, the number of unique labels is less than the number of unique capture
probes and thus a sequential series of decoding steps is used. In this embodiment, decoder probes
are divided into n sets for decoding. The number of sets corresponds to the number of unique tags.
Each decoder probe is labeled in n separate reactions with n distinct tags. All the decoder probes
share the same n tags. The decoder probes are pooled so that each pool contains only one of the n
tag versions of each decoder, and no two decoder probes have the same sequence of tags across all
the pools. The number of pools required for this to be true is determined by the number of decoder
probes and n. Hybridization of each pool to the array generates a signal at every address. The
sequential hybridization of each pool in turn will generate a unique, sequence-specific code for each
candidate probe. This identifies the candidate probe at each address in the array. For example, if four
tags are used, then 4 X n sequential hybridizations can ideally distinguish 4^n sequences, although in
some cases more steps may be required. After the hybridization of each pool, the hybrids are
denatured and the decoder probes removed, so that the probes are rendered single-stranded for the
next hybridization (although it is also possible to hybridize limiting amounts of target so that the
available probe is not saturated. Sequential hybridizations can be carried out and analyzed by
subtracting pre-existing signal from the previous hybridization).
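
By way of illustration only, the pooling scheme described above can be sketched in Python. The sketch
assumes ideal, error-free hybridization, and the function names stages_needed and assign_tag_codes
are hypothetical names introduced here, not part of the disclosure. It relies only on the arithmetic fact
that k pools of T tags offer at most T^k distinct tag sequences, and gives each decoder probe one tag
per pool so that no two probes share the same sequence.

```python
from itertools import product


def stages_needed(num_probes: int, num_tags: int) -> int:
    """Smallest number of pools k such that num_tags ** k >= num_probes."""
    if num_probes < 1 or num_tags < 2:
        raise ValueError("need at least one probe and at least two tags")
    k = 1
    while num_tags ** k < num_probes:
        k += 1
    return k


def assign_tag_codes(num_probes: int, tags: str) -> dict[int, tuple[str, ...]]:
    """Map probe numbers 1..num_probes to distinct tag sequences (one tag per pool).

    Illustrative assumption: codes are handed out in lexicographic order
    (AA, AB, AC, ...); any assignment of distinct sequences would serve.
    """
    k = stages_needed(num_probes, len(tags))
    codes = product(tags, repeat=k)
    return dict(zip(range(1, num_probes + 1), codes))
```

Under these assumptions, assign_tag_codes(16, "ABCD") yields two-tag codes, so that pool 1 carries the
first tag of every decoder probe and pool 2 carries the second.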

An example is illustrative. Assume an array of 16 probe nucleic acids (numbers 1-16), and four
unique tags (four different fluors, for example; labels A-D). Decoder probes 1-16 are made that
correspond to the probes on the beads. The first step is to label decoder probes 1-4 with tag A,
decoder probes 5-8 with tag B, decoder probes 9-12 with tag C, and decoder probes 13-16 with tag D.
The probes are mixed and the pool is contacted with the array containing the beads with the attached
candidate probes. The location of each tag (and thus each decoder and candidate probe pair) is then
determined. The first set of decoder probes are then removed. A second set is added, but this time,
decoder probes 1, 5, 9 and 13 are labeled with tag A, decoder probes 2, 6, 10 and 14 are labeled with
tag B, decoder probes 3, 7, 11 and 15 are labeled with tag C, and decoder probes 4, 8, 12 and 16 are
labeled with tag D. Thus, those beads that contained tag A in both decoding steps contain candidate
probe 1; tag A in the first decoding step and tag B in the second decoding step contain candidate
probe 2; tag A in the first decoding step and tag C in the second step contain candidate probe 3; etc.
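
The 16-probe example above can be restated as a lookup table. The following Python sketch is
illustrative only; the names first_step_tag, second_step_tag, LOOKUP and decode_bead are
hypothetical, and the sketch assumes that exactly one tag is read at each bead in each decoding step.

```python
TAGS = "ABCD"


def first_step_tag(probe: int) -> str:
    # First decoding step: probes 1-4 -> A, 5-8 -> B, 9-12 -> C, 13-16 -> D.
    return TAGS[(probe - 1) // 4]


def second_step_tag(probe: int) -> str:
    # Second step: probes 1, 5, 9, 13 -> A; 2, 6, 10, 14 -> B; and so on.
    return TAGS[(probe - 1) % 4]


# Each (step 1 tag, step 2 tag) pair identifies exactly one of the 16 probes.
LOOKUP = {(first_step_tag(p), second_step_tag(p)): p for p in range(1, 17)}


def decode_bead(tag_step1: str, tag_step2: str) -> int:
    """Return the candidate probe number for the tags observed at one bead."""
    return LOOKUP[(tag_step1, tag_step2)]
```

For instance, a bead showing tag A in both steps decodes to probe 1, and a bead showing tag A and
then tag C decodes to probe 3, in agreement with the example.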

In one embodiment, the decoder probes are labeled in situ; that is, they need not be labeled prior to
the decoding reaction. In this embodiment, the incoming decoder probe is shorter than the candidate
probe, creating a 5' "overhang" on the decoding probe. The addition of labeled ddNTPs (each labeled
with a unique tag) and a polymerase will allow the addition of the tags in a sequence specific manner,
thus creating a sequence-specific pattern of signals. Similarly, other modifications can be done,
including ligation, etc.

In addition, since the size of the array will be set by the number of unique decoding binding ligands, it
is possible to "reuse" a set of unique DBLs to allow for a greater number of test sites. This may be
done in several ways; for example, by using some subpopulations that comprise optical signatures.
Similarly, a positional coding scheme may be used within an array; different sub-bundles may reuse the
set of DBLs. Similarly, one embodiment utilizes bead size as a coding modality, thus allowing the
reuse of the set of unique DBLs for each bead size. Alternatively, sequential partial loading of arrays
with beads can also allow the reuse of DBLs. Furthermore, "code sharing" can occur as well.

In a preferred embodiment, the DBLs may be reused by having some subpopulations of beads
comprise optical signatures. In a preferred embodiment, the optical signature is generally a mixture of
reporter dyes, preferably fluorescent. By varying both the composition of the mixture (i.e. the ratio of
one dye to another) and the concentration of the dye (leading to differences in signal intensity),
matrices of unique optical signatures may be generated. This may be done by covalently attaching the
dyes to the surface of the beads, or alternatively, by entrapping the dye within the bead.

In a preferred embodiment, the encoding can be accomplished in a ratio of at least two dyes, although
more encoding dimensions may be added in the size of the beads, for example. In addition, the labels
are distinguishable from one another; thus two different labels may comprise different molecules (i.e.
two different fluors) or, alternatively, one label at two different concentrations or intensities.

In a preferred embodiment, the dyes are covalently attached to the surface of the beads. This may be
done as is generally outlined for the attachment of the capture probes, using functional groups on the
surface of the beads. As will be appreciated by those in the art, these attachments are done to
minimize the effect on the dye.

In a preferred embodiment, the dyes are non-covalently associated with the beads, generally by 
entrapping the dyes in the pores of the beads. 

Additionally, encoding in the ratios of the two or more dyes, rather than single dye concentrations, is 
preferred since it provides insensitivity to the intensity of light used to interrogate the reporter dye's 
signature and detector sensitivity. 
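
The insensitivity of ratio encoding to illumination intensity can be seen in a short Python sketch. This is
an illustration under simplifying assumptions (both dye signals scale linearly and equally with excitation
intensity and detector gain); the names dye_ratio and nearest_code, and the example code ratios, are
hypothetical.

```python
def dye_ratio(intensity_dye1: float, intensity_dye2: float) -> float:
    """Ratio of two dye signals measured on the same bead."""
    if intensity_dye2 <= 0:
        raise ValueError("second dye intensity must be positive")
    return intensity_dye1 / intensity_dye2


def nearest_code(ratio: float, code_ratios: dict[str, float]) -> str:
    """Return the name of the code whose nominal ratio is closest to the measurement."""
    return min(code_ratios, key=lambda name: abs(code_ratios[name] - ratio))
```

Multiplying both intensities by a common factor, as a brighter lamp or a more sensitive detector would,
leaves the ratio and therefore the assigned code unchanged, whereas a code based on a single dye
intensity would shift.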

In a preferred embodiment, a spatial or positional coding system is used. In this embodiment, there
are sub-bundles or subarrays (i.e. portions of the total array) that are utilized. By analogy with the
telephone system, each subarray is an "area code" that can have the same tags (i.e. telephone
numbers) as other subarrays, which are separated by virtue of the location of the subarray. Thus, for
example, the same unique tags can be reused from bundle to bundle. Thus, the use of 50 unique tags
in combination with 100 different subarrays can form an array of 5000 different capture probes. In this
embodiment, it becomes important to be able to identify one bundle from another; in general, this is
done either manually or through the use of marker beads, i.e. beads containing unique tags for each
subarray.
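
The "area code" arithmetic above can be sketched as follows. The Python below is illustrative only; the
names probe_index and locate and the zero-based numbering are assumptions made for the sketch.

```python
def probe_index(subarray: int, tag: int, tags_per_subarray: int = 50) -> int:
    """Combine a subarray number (the "area code") and a tag into one probe index."""
    if not 0 <= tag < tags_per_subarray:
        raise ValueError("tag out of range for this subarray")
    return subarray * tags_per_subarray + tag


def locate(probe: int, tags_per_subarray: int = 50) -> tuple[int, int]:
    """Inverse mapping: recover (subarray, tag) from a probe index."""
    return divmod(probe, tags_per_subarray)
```

With 100 subarrays and 50 tags, every (subarray, tag) pair maps to a distinct index from 0 to 4999,
which corresponds to the 5000 capture probes mentioned above.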

In alternative embodiments, additional encoding parameters can be added, such as microsphere size.
For example, the use of different size beads may also allow the reuse of sets of DBLs; that is, it is
possible to use microspheres of different sizes to expand the encoding dimensions of the
microspheres. Optical fiber arrays can be fabricated containing pixels with different fiber diameters or
cross-sections; alternatively, two or more fiber optic bundles, each with different cross-sections of the
individual fibers, can be added together to form a larger bundle; or, fiber optic bundles with fibers of the
same size cross-sections can be used, but just with different sized beads. With different diameters,
the largest wells can be filled with the largest microspheres, then moving on to progressively
smaller microspheres in the smaller wells until wells of all sizes are filled. In this manner, the same
dye ratio could be used to encode microspheres of different sizes, thereby expanding the number of
different oligonucleotide sequences or chemical functionalities present in the array. Although outlined
for fiber optic substrates, this as well as the other methods outlined herein can be used with other
substrates and with other attachment modalities as well.

In a preferred embodiment, the coding and decoding is accomplished by sequential loading of the
microspheres into the array. As outlined above for spatial coding, in this embodiment, the optical
signatures can be "reused". In this embodiment, the library of microspheres each comprising a
different capture probe (or the subpopulations each comprise a different capture probe), is divided into
a plurality of sublibraries; for example, depending on the size of the desired array and the number of
unique tags, 10 sublibraries each comprising roughly 10% of the total library may be made, with each
sublibrary comprising roughly the same unique tags. Then, the first sublibrary is added to the fiber
optic bundle comprising the wells, and the location of each capture probe is determined, generally
through the use of DBLs. The second sublibrary is then added, and the location of each capture probe
is again determined. The signal in this case will comprise the signal from the "first" DBL and the
"second" DBL; by comparing the two matrices the location of each bead in each sublibrary can be
determined. Similarly, adding the third, fourth, etc. sublibraries sequentially will allow the array to be
filled.
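
The comparison of successive decoding matrices can be sketched in Python as follows. This is a
simplified illustration, not the disclosed procedure in full: it assumes that beads loaded in an earlier
round stay in place and are read again in every later round, and the name assign_sublibraries and
the dictionary representation of a decoding matrix are hypothetical.

```python
def assign_sublibraries(decodes: list[dict[int, str]]) -> dict[int, tuple[int, str]]:
    """Attribute each occupied site to the loading round in which it first gave a signal.

    decodes[i] maps site number -> tag observed after sublibrary i + 1 was loaded.
    Returns site number -> (sublibrary number, tag).
    """
    assigned: dict[int, tuple[int, str]] = {}
    for round_number, decode in enumerate(decodes, start=1):
        for site, tag in decode.items():
            # A site seen for the first time in this round holds a bead
            # from the sublibrary loaded in this round.
            if site not in assigned:
                assigned[site] = (round_number, tag)
    return assigned
```

Because each site is tied to a sublibrary as well as a tag, the same tag can identify a different capture
probe in each sublibrary.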

In a preferred embodiment, codes can be "shared" in several ways. In a first embodiment, a single
code (i.e. IBL/DBL pair) can be assigned to two or more agents if the target sequences differ
sufficiently in their binding strengths. For example, two nucleic acid probes used in an mRNA
quantitation assay can share the same code if the ranges of their hybridization signal intensities do not
overlap. This can occur, for example, when one of the target sequences is always present at a much
higher concentration than the other. Alternatively, the two target sequences might always be present
at a similar concentration, but differ in hybridization efficiency.
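
As a hedged illustration of code sharing by non-overlapping signal ranges, the Python sketch below
resolves which of several probes sharing one code produced a given signal. The function names and
the example intensity ranges are hypothetical; in practice the ranges would have to be established
empirically for each probe.

```python
from typing import Optional


def ranges_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True if the closed intervals a and b share any value."""
    return a[0] <= b[1] and b[0] <= a[1]


def resolve_shared_code(
    intensity: float, ranges: dict[str, tuple[float, float]]
) -> Optional[str]:
    """Return the probe whose expected intensity range contains the measurement.

    Raises ValueError if any two ranges overlap, since the probes could not
    then share a code. Returns None if the intensity falls in no range.
    """
    names = list(ranges)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if ranges_overlap(ranges[first], ranges[second]):
                raise ValueError(f"ranges for {first} and {second} overlap")
    for name, (low, high) in ranges.items():
        if low <= intensity <= high:
            return name
    return None
```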

Alternatively, a single code can be assigned to multiple agents if the agents are functionally equivalent.
For example, if a set of oligonucleotide probes are designed with the common purpose of detecting
the presence of a particular gene, then the probes are functionally equivalent, even though they may
differ in sequence. Similarly, an array of this type could be used to detect homologs of known genes.
In this embodiment, each gene is represented by a heterologous set of probes, hybridizing to different
regions of the gene (and therefore differing in sequence). The set of probes share a common code. If
a homolog is present, it might hybridize to some but not all of the probes. The level of homology might
be indicated by the fraction of probes hybridizing, as well as the average hybridization intensity.
Similarly, multiple antibodies to the same protein could all share the same code.
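
The two indicators of homology mentioned above (fraction of probes hybridizing and average
hybridization intensity) can be computed as in the following Python sketch. The name
homology_summary and the use of a fixed intensity threshold to call a probe "hybridizing" are
assumptions made for illustration, as is averaging the intensity over all probes in the set rather than
only over the hybridizing ones.

```python
def homology_summary(intensities: list[float], threshold: float) -> tuple[float, float]:
    """Summarize one probe set that shares a common code.

    Returns (fraction of probes at or above threshold, mean intensity of all probes).
    """
    if not intensities:
        raise ValueError("probe set is empty")
    hybridizing = sum(1 for value in intensities if value >= threshold)
    fraction = hybridizing / len(intensities)
    mean_intensity = sum(intensities) / len(intensities)
    return fraction, mean_intensity
```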

In a preferred embodiment, decoding of self-assembled random arrays is done on the basis of pH
titration. In this embodiment, in addition to capture probes, the beads comprise optical signatures,
wherein the optical signatures are generated by the use of pH-responsive dyes (sometimes referred to
herein as "pH dyes") such as fluorophores. This embodiment is similar to that outlined in PCT
US98/05025 and U.S.S.N. 09/151,877, both of which are expressly incorporated by reference, except
that the dyes used in the present invention exhibit changes in fluorescence intensity (or other
properties) when the solution pH is adjusted from below the pKa to above the pKa (or vice versa). In a
preferred embodiment, a set of pH dyes are used, each with a different pKa, preferably separated by
at least 0.5 pH units. Preferred embodiments utilize a pH dye set of pKa's of 2.0, 2.5, 3.0, 3.5, 4.0, 4.5,
5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0, 10.5, 11, and 11.5. Each bead can contain any
subset of the pH dyes, and in this way a unique code for the capture probe is generated. Thus, the
decoding of an array is achieved by titrating the array from pH 1 to pH 13, and measuring the
fluorescence signal from each bead as a function of solution pH.
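
A highly idealized Python sketch of pH-titration decoding follows. It assumes, purely for illustration,
that each dye adds a fixed step to the bead fluorescence once the solution pH exceeds its pKa (real
dyes show a gradual, sigmoidal transition), and that fluorescence is sampled midway between adjacent
pKa values. The names simulate_titration and decode_ph_code are hypothetical.

```python
# The 20 pKa values listed above: 2.0, 2.5, ..., 11.5.
PKAS = [2.0 + 0.5 * i for i in range(20)]


def simulate_titration(dyes_present: set[float], ph_points: list[float],
                       step: float = 1.0) -> list[float]:
    """Idealized bead signal: each dye contributes `step` once pH exceeds its pKa."""
    return [sum(step for pka in dyes_present if ph > pka) for ph in ph_points]


def decode_ph_code(ph_points: list[float], signal: list[float],
                   threshold: float = 0.5) -> set[float]:
    """Return the pKa values at which the signal jumps by more than `threshold`.

    ph_points must be sorted in ascending order and bracket each pKa of interest.
    """
    present = set()
    for pka in PKAS:
        below = [s for ph, s in zip(ph_points, signal) if ph < pka]
        above = [s for ph, s in zip(ph_points, signal) if ph > pka]
        # Compare the last reading below the pKa with the first reading above it.
        if below and above and above[0] - below[-1] > threshold:
            present.add(pka)
    return present
```

In this model a bead carrying dyes with pKa 2.5, 7.0 and 11.5 shows three steps in its titration curve,
and the set of step positions is the bead's code.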

Thus the present invention provides array compositions comprising a substrate with a surface 
comprising discrete sites. A population of microspheres is distributed on the sites, and the population 
comprises at least a first and a second subpopulation. Each subpopulation comprises a capture 
probe, and, in addition, at least one optical dye with a given pKa. The pKas of the different optical
dyes are different.

In a preferred embodiment, several levels of redundancy are built into the arrays of the invention.
Building redundancy into an array gives several significant advantages, including the ability to make
quantitative estimates of confidence about the data and significant increases in sensitivity. Thus,
preferred embodiments utilize array redundancy. As will be appreciated by those in the art, there are
at least two types of redundancy that can be built into an array: the use of multiple identical sensor
elements (termed herein "sensor redundancy"), and the use of multiple sensor elements directed to
the same target analyte, but comprising different chemical functionalities (termed herein "target
redundancy"). For example, for the detection of nucleic acids, sensor redundancy utilizes a plurality
of sensor elements such as beads comprising identical binding ligands such as probes. Target
redundancy utilizes sensor elements with different probes to the same target: one probe may span the
first 25 bases of the target, a second probe may span the second 25 bases of the target, etc. By
building either or both of these types of redundancy into an array, significant benefits are obtained.
For example, a variety of statistical mathematical analyses may be done.

In addition, while this is generally described herein for bead arrays, as will be appreciated by those in
the art, these techniques can be used for any type of array designed to detect target analytes.

In a preferred embodiment, sensor redundancy is used. In this embodiment, a plurality of sensor 
elements, e.g. beads, comprising identical bioactive agents are used. That is, each subpopulation 
comprises a plurality of beads comprising identical bioactive agents (e.g. binding ligands). By using a 
number of identical sensor elements for a given array, the optical signal from each sensor element can 
be combined and any number of statistical analyses run, as outlined below. This can be done for a
variety of reasons. For example, in time varying measurements, redundancy can significantly reduce 
the noise in the system. For non-time based measurements, redundancy can significantly increase 
the confidence of the data. 

In a preferred embodiment, a plurality of identical sensor elements are used. As will be appreciated by
those in the art, the number of identical sensor elements will vary with the application and use of the
sensor array. In general, anywhere from 2 to thousands may be used, with from 2 to 100 being
preferred, 2 to 50 being particularly preferred and from 5 to 20 being especially preferred. In general,
preliminary results indicate that roughly 10 beads give a sufficient advantage, although for some
applications, more identical sensor elements can be used.

Once obtained, the optical response signals from a plurality of sensor beads within each bead 
subpopulation can be manipulated and analyzed in a wide variety of ways, including baseline 
adjustment, averaging, standard deviation analysis, distribution and cluster analysis, confidence 
interval analysis, mean testing, etc. 

In a preferred embodiment, the first manipulation of the optical response signals is an optional
baseline adjustment. In a typical procedure, the standardized optical responses are adjusted to start
at a value of 0.0 by subtracting the integer 1.0 from all data points. Doing this allows the baseline-loop
data to remain at zero even when summed together and the random response signal noise is
canceled out. When the sample is a fluid, the fluid pulse-loop temporal region, however, frequently
exhibits a characteristic change in response, either positive, negative or neutral, prior to the sample
pulse and often requires a baseline adjustment to overcome noise associated with drift in the first few
data points due to charge buildup in the CCD camera. If no drift is present, typically the baseline from
the first data point for each bead sensor is subtracted from all the response data for the same bead. If
drift is observed, the average baseline from the first ten data points for each bead sensor is
subtracted from all the response data for the same bead. By applying this baseline adjustment,
when multiple bead responses are added together they can be amplified while the baseline remains at
zero. Since all beads respond at the same time to the sample (e.g. the sample pulse), they all see the
pulse at the exact same time and there is no registering or adjusting needed for overlaying their
responses. In addition, other types of baseline adjustment may be done, depending on the
requirements and output of the system used.
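
The two baseline adjustments described above (subtracting the first data point when no drift is
present, or the average of the first ten data points when drift is observed) can be sketched in Python
as follows. The function name baseline_adjust and its parameters are hypothetical.

```python
def baseline_adjust(response: list[float], drift: bool = False,
                    n_points: int = 10) -> list[float]:
    """Subtract a per-bead baseline so that the response starts near zero.

    Without drift, the baseline is the first data point; with drift, it is the
    mean of the first n_points data points (fewer if the response is shorter).
    """
    if not response:
        raise ValueError("response is empty")
    if drift:
        head = response[:n_points]
        baseline = sum(head) / len(head)
    else:
        baseline = response[0]
    return [value - baseline for value in response]
```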

Once the baseline has been adjusted, a number of possible statistical analyses may be run to 
generate known statistical parameters. Analyses based on redundancy are known and generally 
described in texts such as Freund and Walpole, Mathematical Statistics, Prentice Hall, Inc. New 
Jersey, 1980, hereby incorporated by reference in its entirety. 

In a preferred embodiment, signal summing is done by simply adding the intensity values of all 
responses at each time point, generating a new temporal response comprised of the sum of all bead
responses. These values can be baseline-adjusted or raw. As for all the analyses described herein, 
signal summing can be performed in real time or during post-data acquisition data reduction and 
analysis. In one embodiment, signal summing is performed with a commercial spreadsheet program 
(Excel, Microsoft, Redmond, WA) after optical response data is collected. 
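
Signal summing as described above amounts to adding the bead responses point by point. A minimal
Python sketch, with the hypothetical function name sum_responses and assuming every bead response
was sampled at the same time points, is:

```python
def sum_responses(bead_responses: list[list[float]]) -> list[float]:
    """Add the intensity values of all beads at each time point."""
    lengths = {len(response) for response in bead_responses}
    if len(lengths) != 1:
        raise ValueError("all bead responses must have the same, non-zero count")
    # zip(*...) groups the values of all beads that belong to the same time point.
    return [sum(values) for values in zip(*bead_responses)]
```

The inputs may be raw or baseline-adjusted responses; with baseline-adjusted inputs the summed
baseline stays at zero while the response to the sample adds up across beads.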

In a preferred embodiment, cumulative response data is generated by simply adding all data points
in successive time intervals. This final column, comprised of the sum of all data points at a particular 
time interval, may then be compared or plotted with the individual bead responses to determine the 
extent of signal enhancement or improved signal-to-noise ratios. 
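
The comparison of a summed response with an individual bead response may be sketched as follows. The definition of signal-to-noise used here (peak response divided by the standard deviation of the baseline window) is an assumption chosen for illustration, as are the function name and the sample traces.

```python
from statistics import pstdev


def signal_to_noise(trace, n_baseline):
    """Peak response after the baseline window divided by baseline noise.

    trace      : baseline-adjusted intensities in time order
    n_baseline : number of leading points treated as baseline
    """
    baseline = trace[:n_baseline]
    signal = trace[n_baseline:]
    if not baseline or not signal:
        raise ValueError("trace must contain both baseline and response points")
    noise = pstdev(baseline)
    if noise == 0:
        raise ValueError("baseline window has no variation; noise is undefined")
    return max(signal) / noise


# Hypothetical traces: four baseline points followed by one response point.
bead_1 = [1.0, -1.0, 1.0, -1.0, 10.0]
bead_2 = [-1.0, 1.0, 1.0, -1.0, 10.0]
summed = [a + b for a, b in zip(bead_1, bead_2)]

snr_single = signal_to_noise(bead_1, 4)
snr_summed = signal_to_noise(summed, 4)
```

In this example the summed trace has a higher ratio than the single bead because the responses add fully while the baseline fluctuations only partly coincide.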

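In a preferred embodiment, the mean of the subpopulation (i.e. the plurality of identical beads) is
determined, using the well known Equation 1:

Equation 1

$$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$$

where x_i is the response of the i-th bead and n is the number of beads in the subpopulation.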

In some embodiments, the subpopulation may be redefined to exclude some beads if necessary (for 
example for obvious outliers, as discussed below).

In a preferred embodiment, the standard deviation of the subpopulation can be determined, generally
using Equation 2 (for the entire subpopulation) and Equation 3 (for less than the entire subpopulation):

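Equation 2

$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \mu\right)^2}$$

Equation 3

$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}$$

where x_i is the response of the i-th bead, n is the number of beads used, \mu is the mean of the entire subpopulation (Equation 1) and \bar{x} is the mean of the beads sampled.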

As for the mean, the subpopulation may be redefined to exclude some beads if necessary (for
example for obvious outliers, as discussed below). 
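
For illustration, the mean and the two standard deviations referred to as Equations 1 to 3 may be computed with the Python standard library as sketched below. The function name, the `entire_subpopulation` flag and the bead intensities are hypothetical; the sketch assumes the usual population form (divisor n) when every bead of the subpopulation is used and the sample form (divisor n - 1) otherwise.

```python
from statistics import mean, pstdev, stdev


def subpopulation_stats(intensities, entire_subpopulation=True):
    """Return (mean, standard deviation) for a set of identical beads.

    entire_subpopulation : True  -> population standard deviation (divisor n)
                           False -> sample standard deviation (divisor n - 1)
    """
    mu = mean(intensities)
    sd = pstdev(intensities) if entire_subpopulation else stdev(intensities)
    return mu, sd


# Hypothetical intensities for eight beads carrying the same probe.
beads = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu, sigma = subpopulation_stats(beads)
_, s = subpopulation_stats(beads, entire_subpopulation=False)
```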

In a preferred embodiment, statistical analyses are done to evaluate whether a particular data point
has statistical validity within a subpopulation by using techniques including, but not limited to, t
distribution and cluster analysis. This may be done to statistically discard outliers that may otherwise
skew the result, thereby increasing the signal-to-noise ratio of any particular experiment. This may be done
using Equation 4:

Equation 4 
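
One way a t-based outlier screen of the kind described above might look in practice is sketched here. It is an illustration under stated assumptions and is not presented as the form of Equation 4: each bead is compared with the mean and sample standard deviation of the remaining beads, and the critical value `t_critical` is a hypothetical parameter supplied by the user (for example from a t table with n - 2 degrees of freedom for n beads).

```python
from math import sqrt
from statistics import mean, stdev


def flag_outliers(intensities, t_critical):
    """Return indices of beads whose leave-one-out t statistic exceeds t_critical."""
    flagged = []
    for i, x in enumerate(intensities):
        others = intensities[:i] + intensities[i + 1:]
        if len(others) < 2:
            continue  # at least two other beads are needed to estimate spread
        center = mean(others)
        spread = stdev(others)
        if spread == 0:
            # All other beads agree exactly; any different value stands out.
            if x != center:
                flagged.append(i)
            continue
        # Prediction-style statistic for one new value against the other beads.
        t = (x - center) / (spread * sqrt(1 + 1 / len(others)))
        if abs(t) > t_critical:
            flagged.append(i)
    return flagged


# Hypothetical subpopulation with one aberrant bead at index 5.
beads = [10.0, 10.5, 9.5, 10.2, 9.8, 25.0]
outliers = flag_outliers(beads, t_critical=4.0)
kept = [x for i, x in enumerate(beads) if i not in outliers]
```

The retained beads can then be used to recompute the mean and standard deviation of the redefined subpopulation.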



In a preferred embodiment, the quality of the data is evaluated using confidence intervals, as is known
in the art. Confidence intervals can be used to facilitate more comprehensive data processing to 
measure the statistical validity of a result. 
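
A confidence interval for the mean response of a subpopulation may be sketched as follows. The function name and data are hypothetical, and the sketch uses a normal approximation for the critical value as a simplifying assumption; for small numbers of beads a critical value from the t distribution would ordinarily be more appropriate.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(intensities, confidence=0.95):
    """Return (low, high) bounds for the mean using a normal approximation."""
    n = len(intensities)
    if n < 2:
        raise ValueError("at least two beads are required")
    center = mean(intensities)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(intensities) / sqrt(n)
    return center - half_width, center + half_width


beads = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
low, high = mean_confidence_interval(beads)
```

A narrow interval indicates that the redundant beads agree closely, whereas a wide interval suggests that more beads, or outlier removal, may be needed before the result is relied upon.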

In a preferred embodiment, statistical parameters of a subpopulation of beads are used to do
hypothesis testing. One application is tests concerning means, also called mean testing. In this 
application, statistical evaluation is done to determine whether two subpopulations are different. For
example, one sample could be compared with another sample for each subpopulation within an array
to determine if the variation is statistically significant.

In addition, mean testing can also be used to differentiate two different assays that share the same
code. If the two assays give results that are statistically distinct from each other, then the
subpopulations that share a common code can be distinguished from each other on the basis of the
assay and the mean test, shown below in Equation 5:

Equation 5 
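
A test concerning two means may be sketched in Python as follows. This is an illustration only and is not presented as the form of Equation 5: it assumes an unequal-variance (Welch-type) statistic and approximates the two-sided p-value with the normal distribution, which is a simplification for small bead counts. The function name and the sample data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def mean_test(sample_a, sample_b):
    """Return (statistic, approximate two-sided p-value) for two subpopulations."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each subpopulation needs at least two beads")
    # Standard error of the difference between the two means.
    standard_error = sqrt(variance(sample_a) / n_a + variance(sample_b) / n_b)
    if standard_error == 0:
        raise ValueError("both subpopulations have zero variance")
    statistic = (mean(sample_a) - mean(sample_b)) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(statistic)))
    return statistic, p_value


# Hypothetical responses of two subpopulations sharing a common code.
assay_1 = [10.0, 11.0, 9.0, 10.0]
assay_2 = [20.0, 21.0, 19.0, 20.0]
statistic, p_value = mean_test(assay_1, assay_2)
```

A small p-value indicates that the two subpopulations can be treated as statistically distinct, which is the condition under which a shared code can be resolved by the assay result.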



Furthermore, analyzing the distribution of individual members of a subpopulation of sensor elements 
may be done. For example, a subpopulation distribution can be evaluated to determine whether the 
distribution is binomial, Poisson, hypergeometric, etc. 
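
As a rough first screen of the kind of distribution involved, the ratio of variance to mean (the index of dispersion) may be computed as sketched below. This is an assumption-laden illustration rather than a formal goodness-of-fit test: it presumes the bead responses are expressed as count-like, non-negative values, and the function name and data are hypothetical. A ratio near 1 is consistent with a Poisson distribution, while a ratio well below 1 points toward a less dispersed distribution such as the binomial.

```python
from statistics import mean, variance


def dispersion_index(values):
    """Return the sample variance divided by the mean."""
    center = mean(values)
    if center == 0:
        raise ValueError("mean is zero; dispersion index is undefined")
    return variance(values) / center


# Hypothetical count-like responses from eight beads.
counts = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
ratio = dispersion_index(counts)
```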

In addition to the sensor redundancy, a preferred embodiment utilizes a plurality of sensor elements 
that are directed to a single target analyte yet are not identical. For example, a single target
nucleic acid analyte may have two or more sensor elements each comprising a different probe. This 
adds a level of confidence as non-specific binding interactions can be statistically minimized. When 
nucleic acid target analytes are to be evaluated, the redundant nucleic acid probes may be 
overlapping, adjacent, or spatially separated. However, it is preferred that two probes do not compete 
for a single binding site, so adjacent or separated probes are preferred. Similarly, when proteinaceous 
target analytes are to be evaluated, preferred embodiments utilize bioactive agent binding agents that 
bind to different parts of the target. For example, when antibodies (or antibody fragments) are used as 
bioactive agents for the binding of target proteins, preferred embodiments utilize antibodies to different 
epitopes. 

In this embodiment, a plurality of different sensor elements may be used, with from about 2 to about 
20 being preferred, and from about 2 to about 10 being especially preferred, and from 2 to about 5 
being particularly preferred, including 2, 3, 4 or 5. However, as above, more may also be used, 
depending on the application. 

As above, any number of statistical analyses may be run on the data from target redundant sensors.

One benefit of the sensor element summing (referred to herein as "bead summing" when beads are
used) is the increase in sensitivity that can occur.

As will be appreciated by those in the art, the present invention finds use in a wide variety of 
applications. 

In a preferred embodiment, the present invention finds use in gene expression monitoring and profiling 
in any mRNA sample, and in particular in the comparison of different cellular states, including the 
comparison of diseased tissue such as cancerous tissue with normal tissue. 

In a preferred embodiment, the present invention finds use in alternative splice analysis, including both 
discovery of splice junctions and detection of alternative splice junctions. In this embodiment, the 
invention finds use in the correlation of alternative splicing patterns to morphological and clinical
parameters of cancer. Similarly, the mechanisms and regulation of constitutive and alternative splicing 
can be elucidated in a variety of cell types, as outlined above. 

Specific splicing targets are shown below in Table 1 (followed by the references). 

Table 1. Selected splicing targets

No.  Gene Name                      Functional Significance                   Selected Reference
1    acetylcholinesterase           synapse maturation                        1
2    agrin                          AChR clustering at synapses               2
3    AML1                           transcriptional activation                3
4    ASF/SF2                        mRNA splicing regulation                  4
5    Bcl-x                          apoptosis regulation                      5
6    BRCA1                          transcriptional activation                6
7    c-src                          signal transduction regulation            7
8    calcium channel, alpha 1A      neurotransmitter release                  8
9    calcium channel, alpha 1B      neurotransmitter release                  9
10   caspase 1 (ICE)                apoptotic regulation                      10
11   caspase 2 (Ich-1)              apoptotic regulation                      11
12   CD45                           T cell maturation                         12
13   clathrin light chain B         receptor-mediated endocytosis             13
14   cytochrome P450, aromatase     steroid hormone synthesis                 14
15   estrogen receptor 1            hormone response                          15
16   estrogen receptor 2            hormone response                          16
17   fas ligand                     apoptotic regulation                      17
18   FGFR2                          signal transduction                       18
19   fibronectin 1                  wound healing                             19
20   fyn                            growth control                            20
21   glutamate receptor NMDAR1      neurotransmission                         21
22   Hel-N1                         mRNA turnover                             22
23   insulin receptor               signal transduction                       23
24   integrin beta                  cell adhesion                             24
25   jun kinase 2                   transcriptional cofactor                  25
26   MUC1                           cell surface tumor marker                 26
27   myosin heavy chain             muscle contraction                        27
28   NCAM                           cell adhesion                             28
29   nNOS                           neurotransmission                         29
30   p15 CDK inhibitor 2B (INK4b)   cell cycle control                        30
31   p16 CDK inhibitor 2A (ARF)     cell cycle control                        31
32   presenilin 2                   apoptotic regulation                      32
33   prostate-specific antigen      cell surface tumor marker                 33
34   SMN                            spinal motor neuron survival              34
35   SRp40                          mRNA splicing regulation                  35
36   tau                            neuronal maturation                       36
37   telomerase                     chromosomal integrity and growth control  37
38   transformer 2 beta             mRNA splicing regulation                  38
39   tropomyosin 1 (alpha)          muscle contraction                        39
40   tropomyosin 2 (beta)           muscle contraction                        40
41   troponin T3                    muscle contraction                        41
42   VEGFR-1                        angiogenesis and vascular permeability    42
43   Wilms tumor 1                  transcriptional regulation                43